[
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: The clearer your bug report, the faster it will be fixed!\ntitle: \"BUG: \"\nlabels: bug\nassignees: ''\n\n---\n\n<!--\nThank you for reporting a bug!\nPlease provide as much detail as possible to help us identify and fix the issue.\n-->\n\n**Problem**\nA clear and concise description of what the bug is.\n\n**Steps to reproduce**\nPlease provide a minimal, self-contained, and reproducible example of the bug.\n1.  ...\n2.  ...\n3.  ...\n\n```python\n# Your code to reproduce the bug here\n```\n\n**Actual behavior and error logs**\nA clear and concise description of what actually happened. Please include the full traceback if an exception was raised.\n```shell\n\n```\n\n**Expected behavior**\nA clear and concise description of what you expected to happen.\n\n**Environment:**\nPlease complete the following information:\n- OS: [e.g., Linux, macOS, Windows]\n- Python version: [e.g., 3.10]\n- Package version: (output of `pip show smolagents`)\n```\n\n```\n\n**Additional context (optional)**\nAdd any other context, screenshots, or links about the bug here.\n\n---\n\n### Checklist\n- [ ] I have searched the existing issues and have not found a similar bug report.\n- [ ] I have provided a minimal, reproducible example.\n- [ ] I have provided the full traceback of the error.\n- [ ] I have provided my environment details.\n- [ ] I am willing to work on this issue and submit a pull request. (optional)\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/doc_improvement.md",
    "content": "---\nname: Documentation Improvement\nabout: Report wrong or missing documentation\ntitle: 'DOC: '\nlabels: documentation\nassignees: ''\n\n---\n\n<!--\nThank you for contributing to our documentation!\nPlease provide as much detail as possible.\n-->\n\n**Problem**\nA clear and description of what is wrong or missing in the documentation.\n\n**Location of the documentation**\nProvide the specific location of the documentation that needs improvement. Select what is applicable:\n- Function/Class/Method name:  (if applicable: e.g., `module.ClassName.method_name`)\n- URL: (if applicable: e.g. `https://huggingface.co/docs/smolagents/installation`)\n\n**Suggested improvement**\nA clear and concise description of the fix and improvement you suggest and why it is better.\n\n**Additional context (optional)**\nAdd any other context or screenshots about the documentation improvement here.\n\n---\n\n### Checklist\n- [ ] I have searched the existing issues and have not found a similar issue.\n- [ ] I am willing to work on this issue and submit a pull request. (optional)\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: 'ENH: '\nlabels: enhancement\nassignees: ''\n\n---\n\n<!--\nThank you for suggesting an idea to improve this project!\nPlease provide as much detail as possible.\n-->\n\n**Problem**\nA clear and concise description of what the problem is that this feature would solve. For example, \"I'm always frustrated when...\"\n\n**Proposed solution**\nA clear and concise description of what you want to happen.\n\n**Is this not possible with the current options.**\nMake sure to consider if what you're requesting can be done with current abstractions.\n\n**Alternatives considered**\nA clear and concise description of any alternative solutions or features you've considered.\nPlease also describe if what you're requesting can be achieved with the current abstractions, and if so, why a new feature is still needed.\n\n**Additional context (optional)**\nAdd any other context, screenshots, or links about the feature request here.\n\n---\n\n### Checklist\n- [ ] I have searched the existing issues and have not found a similar feature request.\n- [ ] I have verified that this feature is not already implemented in the latest version.\n- [ ] I am willing to work on this feature and submit a pull request. (optional)\n"
  },
  {
    "path": ".github/workflows/build_documentation.yml",
    "content": "name: Build documentation\n\non:\n  push:\n    branches:\n      - main\n      - doc-builder*\n      - v*-release\n      - use_templates\n    paths:\n      - 'docs/source/**'\n      - 'assets/**'\n      - '.github/workflows/doc-build.yml'\n      - 'pyproject.toml'\n\njobs:\n   build:\n    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main\n    with:\n      commit_sha: ${{ github.sha }}\n      package: smolagents\n      languages: en hi ko zh\n      notebook_folder: smolagents_doc\n      # additional_args: --not_python_module # use this arg if repository is documentation only\n    secrets:\n      token: ${{ secrets.HUGGINGFACE_PUSH }}\n      hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}"
  },
  {
    "path": ".github/workflows/build_pr_documentation.yml",
    "content": "name: Build PR Documentation\n\non:\n  pull_request:\n    paths:\n      - 'docs/source/**'\n      - 'assets/**'\n      - '.github/workflows/doc-pr-build.yml'\n\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}\n  cancel-in-progress: true\n\njobs:\n  build:\n    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main\n    with:\n      commit_sha: ${{ github.event.pull_request.head.sha }}\n      pr_number: ${{ github.event.number }}\n      package: smolagents\n      languages: en hi ko zh\n      # additional_args: --not_python_module # use this arg if repository is documentation only"
  },
  {
    "path": ".github/workflows/quality.yml",
    "content": "name: Quality Check\n\non: [pull_request]\n\nenv:\n  UV_SYSTEM_PYTHON: 1\n\njobs:\n  check_code_quality:\n    runs-on: ubuntu-latest\n    env:\n      UV_HTTP_TIMEOUT: 600 # max 10min to install deps\n\n    steps:\n      - uses: actions/checkout@v6\n      - name: Set up Python\n        uses: actions/setup-python@v6\n        with:\n          python-version: \"3.12\"\n\n      # Setup venv\n      - name: Setup uv\n        run: |\n          pip install --upgrade uv\n\n      - name: Install dependencies\n        run: uv pip install \"smolagents[quality] @ .\"\n\n      # Equivalent of \"make quality\" but step by step\n      - run: ruff check examples src tests  # linter\n      - run: ruff format --check examples src tests  # formatter\n"
  },
  {
    "path": ".github/workflows/tests.yml",
    "content": "name: Python tests\n\non:\n  pull_request:\n  push:\n    branches:\n      - ci-*\n\nenv:\n  UV_SYSTEM_PYTHON: 1\n\njobs:\n  build-ubuntu:\n    runs-on: ubuntu-latest\n    env:\n      UV_HTTP_TIMEOUT: 600 # max 10min to install deps\n\n    strategy:\n      fail-fast: false\n      matrix:\n        python-version: [\"3.10\", \"3.12\"]\n\n    steps:\n      - uses: actions/checkout@v6\n      - name: Set up Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v6\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      # Setup venv\n      - name: Setup uv\n        run: |\n          pip install --upgrade uv\n\n      # Install dependencies\n      - name: Install dependencies\n        run: |\n          uv pip install \"smolagents[test] @ .\"\n\n      # Run tests\n      - name: Test with pytest\n        run: |\n          pytest ./tests/\n"
  },
  {
    "path": ".github/workflows/trufflehog.yml",
    "content": "on:\n  push:\n\nname: Secret Leaks\n\npermissions:\n  contents: read\n\njobs:\n  trufflehog:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v6\n        with:\n          fetch-depth: 0\n      - name: Secret Scanning\n        uses: trufflesecurity/trufflehog@main"
  },
  {
    "path": ".github/workflows/upload_pr_documentation.yml",
    "content": "name: Upload PR Documentation\n\non:\n  workflow_run:\n    workflows: [\"Build PR Documentation\"]\n    types:\n      - completed\n\njobs:\n  build:\n    uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@main\n    with:\n      package_name: smolagents\n    secrets:\n      hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}\n      comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}"
  },
  {
    "path": ".gitignore",
    "content": "# Logging\nlogs\ntmp\nwandb\n\n# Data\ndata\noutputs\ndata/\n\n# Apple\n.DS_Store\n\n# VS Code\n.vscode\n\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\nshare/python-wheels/\nnode_modules/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.nox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n*.py,cover\n.hypothesis/\n.pytest_cache/\ncover/\nuv.lock\n\n# Translations\n*.mo\n*.pot\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\n.pybuilder/\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# IPython\nprofile_default/\nipython_config.py\n\n# pyenv\n# .python-version\n\n# pipenv\n#Pipfile.lock\n\n# poetry\n#poetry.lock\n\n# pdm\n.pdm.toml\n.pdm-python\n.pdm-build/\n\n# PEP 582\n__pypackages__/\n\n# Celery stuff\ncelerybeat-schedule\ncelerybeat.pid\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n.dmypy.json\ndmypy.json\n\n# Pyre type checker\n.pyre/\n\n# pytype static type analyzer\n.pytype/\n\n# Cython debug symbols\ncython_debug/\n\n# PyCharm\n.idea/\n\n# Interpreter\ninterpreter_workspace/\n\n# Archive\narchive/\nsavedir/\noutput/\ntool_output/\n\n# Gradio runtime\n.gradio/"
  },
  {
    "path": ".pre-commit-config.yaml",
    "content": "repos:\n  - repo: https://github.com/astral-sh/ruff-pre-commit\n    rev: v0.2.1\n    hooks:\n      - id: ruff\n        args:\n          - --fix\n      - id: ruff-format\n  - repo: https://github.com/pre-commit/pre-commit-hooks\n    rev: v4.5.0\n    hooks:\n      - id: check-merge-conflict\n      - id: check-yaml\n"
  },
  {
    "path": "AGENTS.md",
    "content": "# Contributor Guidelines\n- Follow OOP principles\n- Be Pythonic: follow Python best practices and idiomatic patterns\n- Write unit tests for new functionality\n"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "content": "\n# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nWe as members, contributors, and leaders pledge to make participation in our\ncommunity a harassment-free experience for everyone, regardless of age, body\nsize, visible or invisible disability, ethnicity, sex characteristics, gender\nidentity and expression, level of experience, education, socio-economic status,\nnationality, personal appearance, race, caste, color, religion, or sexual\nidentity and orientation.\n\nWe pledge to act and interact in ways that contribute to an open, welcoming,\ndiverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our\ncommunity include:\n\n* Demonstrating empathy and kindness toward other people\n* Being respectful of differing opinions, viewpoints, and experiences\n* Giving and gracefully accepting constructive feedback\n* Accepting responsibility and apologizing to those affected by our mistakes,\n  and learning from the experience\n* Focusing on what is best not just for us as individuals, but for the overall\n  community\n\nExamples of unacceptable behavior include:\n\n* The use of sexualized language or imagery, and sexual attention or advances of\n  any kind\n* Trolling, insulting or derogatory comments, and personal or political attacks\n* Public or private harassment\n* Publishing others' private information, such as a physical or email address,\n  without their explicit permission\n* Other conduct which could reasonably be considered inappropriate in a\n  professional setting\n\n## Enforcement Responsibilities\n\nCommunity leaders are responsible for clarifying and enforcing our standards of\nacceptable behavior and will take appropriate and fair corrective action in\nresponse to any behavior that they deem inappropriate, threatening, offensive,\nor harmful.\n\nCommunity leaders have the right and responsibility to remove, edit, or reject\ncomments, commits, code, wiki 
edits, issues, and other contributions that are\nnot aligned to this Code of Conduct, and will communicate reasons for moderation\ndecisions when appropriate.\n\n## Scope\n\nThis Code of Conduct applies within all community spaces, and also applies when\nan individual is officially representing the community in public spaces.\nExamples of representing our community include using an official e-mail address,\nposting via an official social media account, or acting as an appointed\nrepresentative at an online or offline event.\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported to the community leaders responsible for enforcement at\nfeedback@huggingface.co.\nAll complaints will be reviewed and investigated promptly and fairly.\n\nAll community leaders are obligated to respect the privacy and security of the\nreporter of any incident.\n\n## Enforcement Guidelines\n\nCommunity leaders will follow these Community Impact Guidelines in determining\nthe consequences for any action they deem in violation of this Code of Conduct:\n\n### 1. Correction\n\n**Community Impact**: Use of inappropriate language or other behavior deemed\nunprofessional or unwelcome in the community.\n\n**Consequence**: A private, written warning from community leaders, providing\nclarity around the nature of the violation and an explanation of why the\nbehavior was inappropriate. A public apology may be requested.\n\n### 2. Warning\n\n**Community Impact**: A violation through a single incident or series of\nactions.\n\n**Consequence**: A warning with consequences for continued behavior. No\ninteraction with the people involved, including unsolicited interaction with\nthose enforcing the Code of Conduct, for a specified period of time. This\nincludes avoiding interactions in community spaces as well as external channels\nlike social media. Violating these terms may lead to a temporary or permanent\nban.\n\n### 3. 
Temporary Ban\n\n**Community Impact**: A serious violation of community standards, including\nsustained inappropriate behavior.\n\n**Consequence**: A temporary ban from any sort of interaction or public\ncommunication with the community for a specified period of time. No public or\nprivate interaction with the people involved, including unsolicited interaction\nwith those enforcing the Code of Conduct, is allowed during this period.\nViolating these terms may lead to a permanent ban.\n\n### 4. Permanent Ban\n\n**Community Impact**: Demonstrating a pattern of violation of community\nstandards, including sustained inappropriate behavior, harassment of an\nindividual, or aggression toward or disparagement of classes of individuals.\n\n**Consequence**: A permanent ban from any sort of public interaction within the\ncommunity.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage],\nversion 2.1, available at\n[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].\n\nCommunity Impact Guidelines were inspired by\n[Mozilla's code of conduct enforcement ladder][Mozilla CoC].\n\nFor answers to common questions about this code of conduct, see the FAQ at\n[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at\n[https://www.contributor-covenant.org/translations][translations].\n\n[homepage]: https://www.contributor-covenant.org\n[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html\n[Mozilla CoC]: https://github.com/mozilla/diversity\n[FAQ]: https://www.contributor-covenant.org/faq\n[translations]: https://www.contributor-covenant.org/translations"
  },
  {
    "path": "CONTRIBUTING.md",
    "content": "<!---\nCopyright 2025 The HuggingFace Team. All rights reserved.\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n-->\n\n# Contribute to smolagents\n\nEveryone is welcome to contribute, and we value everybody's contribution. Code\ncontributions are not the only way to help the community. Answering questions, helping\nothers, and improving the documentation are also immensely valuable.\n\nIt also helps us if you spread the word! Reference the library in blog posts\nabout the awesome projects it made possible, shout out on Twitter every time it has\nhelped you, or simply ⭐️ the repository to say thank you.\n\nHowever you choose to contribute, please be mindful and respect our\n[code of conduct](https://github.com/huggingface/smolagents/blob/main/CODE_OF_CONDUCT.md).\n\n**This guide was heavily inspired by the awesome [scikit-learn guide to contributing](https://github.com/scikit-learn/scikit-learn/blob/main/CONTRIBUTING.md).**\n\n## Ways to contribute\n\nThere are several ways you can contribute to smolagents.\n\n* Submit issues related to bugs or desired new features.\n* Contribute to the examples or to the documentation.\n* Fix outstanding issues with the existing code.\n\n> All contributions are equally valuable to the community. 
🥰\n\n## Submitting a bug-related issue or feature request\n\nAt any moment, feel welcome to open an issue, citing your exact error traces and package versions if it's a bug.\nIt's often even better to open a PR with your proposed fixes/changes!\n\nDo your best to follow these guidelines when submitting a bug-related issue or a feature\nrequest. It will make it easier for us to come back to you quickly and with good\nfeedback.\n\n### Did you find a bug?\n\nThe smolagents library is robust and reliable thanks to users who report the problems they encounter.\n\nBefore you report an issue, we would really appreciate it if you could **make sure the bug was not\nalready reported** (use the search bar on GitHub under Issues). Your issue should also be related to bugs in the \nlibrary itself, and not your code. \n\nOnce you've confirmed the bug hasn't already been reported, please include the following information in your issue so \nwe can quickly resolve it:\n\n* Your **OS type and version**, as well as your environment versions (versions of rust, python, and dependencies).\n* A short, self-contained, code snippet that allows us to reproduce the bug.\n* The *full* traceback if an exception is raised.\n* Attach any other additional information, like screenshots, you think may help.\n\n### Do you want a new feature?\n\nIf there is a new feature you'd like to see in smolagents, please open an issue and describe:\n\n1. What is the *motivation* behind this feature? Is it related to a problem or frustration with the library? Is it \n   a feature related to something you need for a project? Is it something you worked on and think it could benefit \n   the community?\n\n   Whatever it is, we'd love to hear about it!\n\n2. Describe your requested feature in as much detail as possible. The more you can tell us about it, the better \n   we'll be able to help you.\n3. Provide a *code snippet* that demonstrates the feature's usage.\n4. 
If the feature is related to a paper, please include a link.\n\nIf your issue is well written, we're already 80% of the way there by the time you create it.\n\n## Do you want to add documentation?\n\nWe're always looking for improvements that make the documentation clearer and more accurate. Please let us know\nhow the documentation can be improved, for example by pointing out typos or content that is missing, unclear, or inaccurate. We'll be\nhappy to make the changes or help you make a contribution if you're interested!\n\n## Fixing outstanding issues\n\nIf you notice an issue with the existing code and have a fix in mind, feel free to [start contributing](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) and open\na Pull Request!\n\n### Making code changes\n\nTo install dev dependencies, run:\n<details>\n<summary><strong>Using pip</strong></summary>\n\n```\npip install -e \".[dev]\"\n```\n\n</details>\n<details>\n<summary><strong>Using uv</strong></summary>\n\n```\nuv pip install -e \"smolagents[dev] @ .\"\n```\n\n</details>\n\nWhen making changes to the codebase, please check that they follow the repo's code quality requirements by running:\n```\nmake quality\n```\n\nIf the checks fail, you can run the formatter with:\n```\nmake style\n```\n\nThen commit the changes.\n\nTo run tests locally, run this command:\n```bash\nmake test\n```\n\n## I want to become a maintainer of the project. How do I get there?\n\nsmolagents is a project led and managed by Hugging Face. We are more than\nhappy to have motivated individuals from other organizations join us as maintainers with the goal of helping smolagents\nmake a dent in the world of Agents.\n\nIf you are such an individual (or organization), please reach out to us and let's collaborate.\n"
  },
  {
    "path": "LICENSE",
    "content": "                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n"
  },
  {
    "path": "Makefile",
    "content": ".PHONY: quality style test docs\n\ncheck_dirs := examples src tests\n\n# Check code quality of the source code\nquality:\n\truff check $(check_dirs)\n\truff format --check $(check_dirs)\n\n# Format source code automatically\nstyle:\n\truff check $(check_dirs) --fix\n\truff format $(check_dirs)\n\t\n# Run smolagents tests\ntest:\n\tpytest ./tests/"
  },
  {
    "path": "README.md",
    "content": "<!---\nCopyright 2024 The HuggingFace Team. All rights reserved.\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n-->\n<p align=\"center\">\n    <!-- Uncomment when CircleCI is set up\n    <a href=\"https://circleci.com/gh/huggingface/accelerate\"><img alt=\"Build\" src=\"https://img.shields.io/circleci/build/github/huggingface/transformers/master\"></a>\n    -->\n    <a href=\"https://github.com/huggingface/smolagents/blob/main/LICENSE\"><img alt=\"License\" src=\"https://img.shields.io/github/license/huggingface/smolagents.svg?color=blue\"></a>\n    <a href=\"https://huggingface.co/docs/smolagents\"><img alt=\"Documentation\" src=\"https://img.shields.io/website/http/huggingface.co/docs/smolagents/index.html.svg?down_color=red&down_message=offline&up_message=online\"></a>\n    <a href=\"https://github.com/huggingface/smolagents/releases\"><img alt=\"GitHub release\" src=\"https://img.shields.io/github/release/huggingface/smolagents.svg\"></a>\n    <a href=\"https://github.com/huggingface/smolagents/blob/main/CODE_OF_CONDUCT.md\"><img alt=\"Contributor Covenant\" src=\"https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg\"></a>\n    <a href=\"https://deepwiki.com/huggingface/smolagents\"><img src=\"https://deepwiki.com/badge.svg\" alt=\"Ask DeepWiki\"></a>\n</p>\n\n<h3 align=\"center\">\n  <div style=\"display:flex;flex-direction:row;\">\n    <img 
src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/smolagents.png\" alt=\"Hugging Face mascot as James Bond\" width=400px>\n    <p>Agents that think in code!</p>\n  </div>\n</h3>\n\n`smolagents` is a library that enables you to run powerful agents in a few lines of code. It offers:\n\n✨ **Simplicity**: the logic for agents fits in ~1,000 lines of code (see [agents.py](https://github.com/huggingface/smolagents/blob/main/src/smolagents/agents.py)). We kept abstractions to their minimal shape above raw code!\n\n🧑‍💻 **First-class support for Code Agents**. Our [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) writes its actions in code (as opposed to \"agents being used to write code\"). To make it secure, we support executing in sandboxed environments via [Blaxel](https://blaxel.ai), [E2B](https://e2b.dev/), [Modal](https://modal.com/), Docker, or Pyodide+Deno WebAssembly sandbox.\n\n🤗 **Hub integrations**: you can [share/pull tools or agents to/from the Hub](https://huggingface.co/docs/smolagents/reference/tools#smolagents.Tool.from_hub) for instant sharing of the most efficient agents!\n\n🌐 **Model-agnostic**: smolagents supports any LLM. It can be a local `transformers` or `ollama` model, one of [many providers on the Hub](https://huggingface.co/blog/inference-providers), or any model from OpenAI, Anthropic and many others via our [LiteLLM](https://www.litellm.ai/) integration.\n\n👁️ **Modality-agnostic**: Agents support text, vision, video, even audio inputs! 
See [this tutorial](https://huggingface.co/docs/smolagents/examples/web_browser) for vision.\n\n🛠️ **Tool-agnostic**: you can use tools from any [MCP server](https://huggingface.co/docs/smolagents/reference/tools#smolagents.ToolCollection.from_mcp), from [LangChain](https://huggingface.co/docs/smolagents/reference/tools#smolagents.Tool.from_langchain), you can even use a [Hub Space](https://huggingface.co/docs/smolagents/reference/tools#smolagents.Tool.from_space) as a tool.\n\nFull documentation can be found [here](https://huggingface.co/docs/smolagents/index).\n\n> [!NOTE]\n> Check out our [launch blog post](https://huggingface.co/blog/smolagents) to learn more about `smolagents`!\n\n## Quick demo\n\nFirst install the package with a default set of tools:\n```bash\npip install \"smolagents[toolkit]\"\n```\nThen define your agent, give it the tools it needs and run it!\n```py\nfrom smolagents import CodeAgent, WebSearchTool, InferenceClientModel\n\nmodel = InferenceClientModel()\nagent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True)\n\nagent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\n```\n\nhttps://github.com/user-attachments/assets/84b149b4-246c-40c9-a48d-ba013b08e600\n\nYou can even share your agent to the Hub, as a Space repository:\n```py\nagent.push_to_hub(\"m-ric/my_agent\")\n\n# agent.from_hub(\"m-ric/my_agent\") to load an agent from Hub\n```\n\nOur library is LLM-agnostic: you could switch the example above to any inference provider.\n\n<details>\n<summary> <b>InferenceClientModel, gateway for all <a href=\"https://huggingface.co/docs/inference-providers/index\">inference providers</a> supported on HF</b></summary>\n\n```py\nfrom smolagents import InferenceClientModel\n\nmodel = InferenceClientModel(\n    model_id=\"deepseek-ai/DeepSeek-R1\",\n    provider=\"together\",\n)\n```\n</details>\n<details>\n<summary> <b>LiteLLM to access 100+ LLMs</b></summary>\n\n```py\nimport os\nfrom 
smolagents import LiteLLMModel\n\nmodel = LiteLLMModel(\n    model_id=\"anthropic/claude-4-sonnet-latest\",\n    temperature=0.2,\n    api_key=os.environ[\"ANTHROPIC_API_KEY\"]\n)\n```\n</details>\n<details>\n<summary> <b>OpenAI-compatible servers: Together AI</b></summary>\n\n```py\nimport os\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"deepseek-ai/DeepSeek-R1\",\n    api_base=\"https://api.together.xyz/v1/\", # Leave this blank to query OpenAI servers.\n    api_key=os.environ[\"TOGETHER_API_KEY\"], # Switch to the API key for the server you're targeting.\n)\n```\n</details>\n<details>\n<summary> <b>OpenAI-compatible servers: OpenRouter</b></summary>\n\n```py\nimport os\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"openai/gpt-4o\",\n    api_base=\"https://openrouter.ai/api/v1\", # Leave this blank to query OpenAI servers.\n    api_key=os.environ[\"OPENROUTER_API_KEY\"], # Switch to the API key for the server you're targeting.\n)\n```\n\n</details>\n<details>\n<summary> <b>Local `transformers` model</b></summary>\n\n```py\nfrom smolagents import TransformersModel\n\nmodel = TransformersModel(\n    model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\",\n    max_new_tokens=4096,\n    device_map=\"auto\"\n)\n```\n</details>\n<details>\n<summary> <b>Azure models</b></summary>\n\n```py\nimport os\nfrom smolagents import AzureOpenAIModel\n\nmodel = AzureOpenAIModel(\n    model_id = os.environ.get(\"AZURE_OPENAI_MODEL\"),\n    azure_endpoint=os.environ.get(\"AZURE_OPENAI_ENDPOINT\"),\n    api_key=os.environ.get(\"AZURE_OPENAI_API_KEY\"),\n    api_version=os.environ.get(\"OPENAI_API_VERSION\")    \n)\n```\n</details>\n<details>\n<summary> <b>Amazon Bedrock models</b></summary>\n\n```py\nimport os\nfrom smolagents import AmazonBedrockModel\n\nmodel = AmazonBedrockModel(\n    model_id = os.environ.get(\"AMAZON_BEDROCK_MODEL_ID\") \n)\n```\n</details>\n\n## CLI\n\nYou can run agents from CLI using two commands: `smolagent` 
and `webagent`.\n\n`smolagent` is a generalist command to run a multi-step `CodeAgent` that can be equipped with various tools.\n\n```bash\n# Run with direct prompt and options\nsmolagent \"Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7.\"  --model-type \"InferenceClientModel\" --model-id \"Qwen/Qwen3-Next-80B-A3B-Thinking\" --imports pandas numpy --tools web_search\n\n# Run in interactive mode (launches setup wizard when no prompt provided)\nsmolagent\n```\n\nInteractive mode guides you through:\n- Agent type selection (CodeAgent vs ToolCallingAgent)  \n- Tool selection from available toolbox\n- Model configuration (type, ID, API settings)\n- Advanced options like additional imports\n- Task prompt input\n\nMeanwhile `webagent` is a specific web-browsing agent using [helium](https://github.com/mherrmann/helium) (read more [here](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py)).\n\nFor instance:\n```bash\nwebagent \"go to xyz.com/men, get to sale section, click the first clothing item you see. Get the product details, and the price, return them. 
note that I'm shopping from France\" --model-type \"LiteLLMModel\" --model-id \"gpt-5\"\n```\n\n## How do Code agents work?\n\nOur [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) works mostly like classical ReAct agents - the exception being that the LLM engine writes its actions as Python code snippets.\n\n```mermaid\nflowchart TB\n    Task[User Task]\n    Memory[agent.memory]\n    Generate[Generate from agent.model]\n    Execute[Execute Code action - Tool calls are written as functions]\n    Answer[Return the argument given to 'final_answer']\n\n    Task -->|Add task to agent.memory| Memory\n\n    subgraph ReAct[ReAct loop]\n        Memory -->|Memory as chat messages| Generate\n        Generate -->|Parse output to extract code action| Execute\n        Execute -->|No call to 'final_answer' tool => Store execution logs in memory and keep running| Memory\n    end\n    \n    Execute -->|Call to 'final_answer' tool| Answer\n\n    %% Styling\n    classDef default fill:#d4b702,stroke:#8b7701,color:#ffffff\n    classDef io fill:#4a5568,stroke:#2d3748,color:#ffffff\n    \n    class Task,Answer io\n```\n\nActions are now Python code snippets. Hence, tool calls will be performed as Python function calls. For instance, here is how the agent can perform web search over several websites in one single action:\n```py\nrequests_to_search = [\"gulf of mexico america\", \"greenland denmark\", \"tariffs\"]\nfor request in requests_to_search:\n    print(f\"Here are the search results for {request}:\", web_search(request))\n```\n\nWriting actions as code snippets is demonstrated to work better than the current industry practice of letting the LLM output a dictionary of the tools it wants to call: [uses 30% fewer steps](https://huggingface.co/papers/2402.01030) (thus 30% fewer LLM calls) and [reaches higher performance on difficult benchmarks](https://huggingface.co/papers/2411.01747). 
Head to [our high-level intro to agents](https://huggingface.co/docs/smolagents/conceptual_guides/intro_agents) to learn more on that.\n\nSince code execution can be a serious security concern (arbitrary code execution!), **you should run agent code in a sandbox**. We support several options:\n  - [E2B](https://e2b.dev/), [Blaxel](https://blaxel.ai), [Modal](https://modal.com/) — managed cloud sandboxes, simplest to set up\n  - [Docker](https://www.docker.com/) — self-hosted container isolation\n  - Pyodide+Deno WebAssembly — lightweight sandbox for browser or edge environments\n\nThe built-in `LocalPythonExecutor` is **not a security sandbox**. It applies some restrictions but can be bypassed and must not be used as a security boundary.\n\nAlongside [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent), we also provide the standard [`ToolCallingAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.ToolCallingAgent) which writes actions as JSON/text blobs. You can pick whichever style best suits your use case.\n\n## How smol is this library?\n\nWe strived to keep abstractions to a strict minimum: the main code in `agents.py` has <1,000 lines of code.\nStill, we implement several types of agents: `CodeAgent` writes its actions as Python code snippets, and the more classic `ToolCallingAgent` leverages built-in tool calling methods. We also have multi-agent hierarchies, import from tool collections, remote code execution, vision models...\n\nBy the way, why use a framework at all? Well, because a big part of this stuff is non-trivial. For instance, the code agent has to keep a consistent format for code throughout its system prompt, its parser, the execution. So our framework handles this complexity for you. 
But of course we still encourage you to hack into the source code and use only the bits that you need, to the exclusion of everything else!\n\n## How strong are open models for agentic workflows?\n\nWe've created [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) instances with some leading models, and compared them on [this benchmark](https://huggingface.co/datasets/m-ric/agents_medium_benchmark_2) that gathers questions from a few different benchmarks to propose a varied blend of challenges.\n\n[Find the benchmarking code here](https://github.com/huggingface/smolagents/blob/main/examples/smolagents_benchmark/run.py) for more detail on the agentic setup used, and see a comparison of LLMs used as code agents versus vanilla (spoiler: code agents work better).\n\n<p align=\"center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/benchmark_code_agents.jpeg\" alt=\"benchmark of different models on agentic workflows. Open model DeepSeek-R1 beats closed-source models.\" width=60% max-width=500px>\n</p>\n\nThis comparison shows that open-source models can now take on the best closed models!\n\n## Security\n\nSecurity is a critical consideration when working with code-executing agents. Ensure you are using one of the sandboxed execution options that provide isolation from untrusted code.\n\n**Warning:** `LocalPythonExecutor` provides best-effort mitigations only and is **not a security boundary**. 
Do not use it to run untrusted code.\n\nFor security policies, vulnerability reporting, and more information on secure agent execution, please see our [Security Policy](SECURITY.md).\n\n## Contribute\n\nEveryone is welcome to contribute: get started with our [contribution guide](https://github.com/huggingface/smolagents/blob/main/CONTRIBUTING.md).\n\n## Cite smolagents\n\nIf you use `smolagents` in your publication, please cite it by using the following BibTeX entry.\n\n```bibtex\n@Misc{smolagents,\n  title =        {`smolagents`: a smol library to build great agentic systems.},\n  author =       {Aymeric Roucher and Albert Villanova del Moral and Thomas Wolf and Leandro von Werra and Erik Kaunismäki},\n  howpublished = {\\url{https://github.com/huggingface/smolagents}},\n  year =         {2025}\n}\n```\n"
  },
  {
    "path": "SECURITY.md",
    "content": "# Security Policy\n\n## Reporting a Vulnerability\n\nTo report a security vulnerability, please contact: security@huggingface.co\n\n## Learning More About Security\n\nTo learn more about running agents more securely, please see the [Secure Code Execution tutorial](docs/source/en/tutorials/secure_code_execution.mdx) which covers sandboxing with E2B, Docker, and WebAssembly.\n\n### Secure Execution Options\n\n`smolagents` provides several options for secure code execution:\n\n1. **E2B Sandbox**: Uses [E2B](https://e2b.dev/) to run code in a secure, isolated environment.\n\n2. **Modal Sandbox**: Uses [Modal](https://modal.com/) to run code in a secure, isolated environment.\n\n3. **Docker Sandbox**: Runs code in an isolated Docker container.\n\n4. **WebAssembly Sandbox**: Executes Python code securely in a sandboxed WebAssembly environment using Pyodide and Deno's secure runtime.\n\nWe recommend using one of these sandboxed execution options when running untrusted code.\n"
  },
  {
    "path": "docs/README.md",
    "content": "<!---\nCopyright 2024 The HuggingFace Team. All rights reserved.\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n-->\n\n# Generating the documentation\n\nTo generate the documentation, you have to build it. Several packages are necessary to build the doc.\n\nFirst, you need to install the project itself by running the following command at the root of the code repository:\n\n```bash\npip install -e .\n```\n\nYou also need to install 2 extra packages:\n\n```bash\n# `hf-doc-builder` to build the docs\npip install git+https://github.com/huggingface/doc-builder@main\n```\n\n---\n**NOTE**\n\nYou only need to generate the documentation to inspect it locally (if you're planning changes and want to\ncheck how they look before committing for instance). You don't have to commit the built documentation.\n\n---\n\n## Building the documentation\n\nOnce you have setup the `doc-builder` and additional packages with the pip install command above,\nyou can generate the documentation by typing the following command:\n\n```bash\ndoc-builder build smolagents docs/source/en/ --build_dir ~/tmp/test-build\n```\n\nYou can adapt the `--build_dir` to set any temporary folder that you prefer. This command will create it and generate\nthe MDX files that will be rendered as the documentation on the main website. 
You can inspect them in your favorite\nMarkdown editor.\n\n## Previewing the documentation\n\nTo preview the docs, run the following command:\n\n```bash\ndoc-builder preview smolagents docs/source/en/\n```\n\nThe docs will be viewable at [http://localhost:5173](http://localhost:5173). You can also preview the docs once you\nhave opened a PR. A bot will add a comment with a link to where the documentation with your changes lives.\n\n---\n**NOTE**\n\nThe `preview` command only works with existing doc files. When you add a completely new file, you need to update\n`_toctree.yml` & restart the `preview` command (`ctrl-c` to stop it & call `doc-builder preview ...` again).\n\n---\n\n## Adding a new element to the navigation bar\n\nAccepted files are Markdown (.md).\n\nCreate a file with its extension and put it in the source directory. You can then link it to the toc-tree by putting\nthe filename without the extension in the [`_toctree.yml`](https://github.com/huggingface/smolagents/blob/main/docs/source/_toctree.yml) file.\n\n## Renaming section headers and moving sections\n\nIt helps to keep the old links working when renaming the section header and/or moving sections from one document to another. This is because the old links are likely to be used in Issues, Forums, and Social media and it'd make for a much better user experience if users reading those months later could still easily navigate to the originally intended information.\n\nTherefore, we simply keep a little map of moved sections at the end of the document where the original section was. 
The key is to preserve the original anchor.\n\nSo if you renamed a section from: \"Section A\" to \"Section B\", then you can add at the end of the file:\n\n```\nSections that were moved:\n\n[ <a href=\"#section-b\">Section A</a><a id=\"section-a\"></a> ]\n```\nand of course, if you moved it to another file, then:\n\n```\nSections that were moved:\n\n[ <a href=\"../new-file#section-b\">Section A</a><a id=\"section-a\"></a> ]\n```\n\nUse the relative style to link to the new file so that the versioned docs continue to work.\n\nFor an example of a rich moved section set please see the very end of [the transformers Trainer doc](https://github.com/huggingface/transformers/blob/main/docs/source/en/main_classes/trainer.md).\n\n\n## Writing Documentation - Specification\n\nThe `huggingface/smolagents` documentation follows the\n[Google documentation](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) style for docstrings,\nalthough we can write them directly in Markdown.\n\n### Adding a new tutorial\n\nAdding a new tutorial or section is done in two steps:\n\n- Add a new Markdown (.md) file under `./source`.\n- Link that file in `./source/_toctree.yml` on the correct toc-tree.\n\nMake sure to put your new file under the proper section. If in doubt, feel free to ask in a GitHub Issue or PR.\n\n### Writing source documentation\n\nValues that should be put in `code` should be surrounded by backticks: \\`like so\\`. Note that argument names\nand objects like True, None, or any strings should usually be put in `code`.\n\nWhen mentioning a class, function, or method, it is recommended to use our syntax for internal links so that our tool\nadds a link to its documentation with this syntax: \\[\\`XXXClass\\`\\] or \\[\\`function\\`\\]. This requires the class or\nfunction to be in the main package.\n\nIf you want to create a link to some internal class or function, you need to\nprovide its path. For instance: \\[\\`utils.ModelOutput\\`\\]. 
This will be converted into a link with\n`utils.ModelOutput` in the description. To get rid of the path and only keep the name of the object you are\nlinking to in the description, add a ~: \\[\\`~utils.ModelOutput\\`\\] will generate a link with `ModelOutput` in the description.\n\nThe same works for methods so you can either use \\[\\`XXXClass.method\\`\\] or \\[\\`~XXXClass.method\\`\\].\n\n#### Defining arguments in a method\n\nArguments should be defined with the `Args:` (or `Arguments:` or `Parameters:`) prefix, followed by a line return and\nan indentation. The argument should be followed by its type, with its shape if it is a tensor, a colon, and its\ndescription:\n\n```\n    Args:\n        n_layers (`int`): The number of layers of the model.\n```\n\nIf the description is too long to fit in one line, another indentation is necessary before writing the description\nafter the argument.\n\nHere's an example showcasing everything so far:\n\n```\n    Args:\n        input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):\n            Indices of input sequence tokens in the vocabulary.\n\n            Indices can be obtained using [`AlbertTokenizer`]. See [`~PreTrainedTokenizer.encode`] and\n            [`~PreTrainedTokenizer.__call__`] for details.\n\n            [What are input IDs?](../glossary#input-ids)\n```\n\nFor optional arguments or arguments with defaults we use the following syntax: imagine we have a function with the\nfollowing signature:\n\n```\ndef my_function(x: str = None, a: float = 1):\n```\n\nthen its documentation should look like this:\n\n```\n    Args:\n        x (`str`, *optional*):\n            This argument controls ...\n        a (`float`, *optional*, defaults to 1):\n            This argument is used to ...\n```\n\nNote that we always omit the \"defaults to \\`None\\`\" when None is the default for any argument. 
Also note that even\nif the first line describing your argument type and its default gets long, you can't break it on several lines. You can\nhowever write as many lines as you want in the indented description (see the example above with `input_ids`).\n\n#### Writing a multi-line code block\n\nMulti-line code blocks can be useful for displaying examples. They are written between two lines of three backticks as usual in Markdown:\n\n\n````\n```\n# first line of code\n# second line\n# etc\n```\n````\n\n#### Writing a return block\n\nThe return block should be introduced with the `Returns:` prefix, followed by a line return and an indentation.\nThe first line should be the type of the return, followed by a line return. No need to indent further for the elements\nbuilding the return.\n\nHere's an example of a single value return:\n\n```\n    Returns:\n        `List[int]`: A list of integers in the range [0, 1] --- 1 for a special token, 0 for a sequence token.\n```\n\nHere's an example of a tuple return, comprising several objects:\n\n```\n    Returns:\n        `tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:\n        - **loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --\n          Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss.\n        - **prediction_scores** (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --\n          Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).\n```\n\n#### Adding an image\n\nDue to the rapidly growing repository, it is important to make sure that no files that would significantly weigh down the repository are added. This includes images, videos, and other non-text files. 
We prefer to leverage a hf.co hosted `dataset` like\nthe ones hosted on [`hf-internal-testing`](https://huggingface.co/hf-internal-testing) in which to place these files and reference\nthem by URL. We recommend putting them in the following dataset: [huggingface/documentation-images](https://huggingface.co/datasets/huggingface/documentation-images).\nIf an external contribution, feel free to add the images to your PR and ask a Hugging Face member to migrate your images\nto this dataset.\n\n#### Writing documentation examples\n\nThe syntax for Example docstrings can look as follows:\n\n```\n    Example:\n\n    ```python\n    >>> from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC\n    >>> from datasets import load_dataset\n    >>> import torch\n\n    >>> dataset = load_dataset(\"hf-internal-testing/librispeech_asr_demo\", \"clean\", split=\"validation\")\n    >>> dataset = dataset.sort(\"id\")\n    >>> sampling_rate = dataset.features[\"audio\"].sampling_rate\n\n    >>> processor = Wav2Vec2Processor.from_pretrained(\"facebook/wav2vec2-base-960h\")\n    >>> model = Wav2Vec2ForCTC.from_pretrained(\"facebook/wav2vec2-base-960h\")\n\n    >>> # audio file is decoded on the fly\n    >>> inputs = processor(dataset[0][\"audio\"][\"array\"], sampling_rate=sampling_rate, return_tensors=\"pt\")\n    >>> with torch.no_grad():\n    ...     logits = model(**inputs).logits\n    >>> predicted_ids = torch.argmax(logits, dim=-1)\n\n    >>> # transcribe speech\n    >>> transcription = processor.batch_decode(predicted_ids)\n    >>> transcription[0]\n    'MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL'\n    ```\n```\n\nThe docstring should give a minimal, clear example of how the respective model\nis to be used in inference and also include the expected (ideally sensible)\noutput.\nOften, readers will try out the example before even going through the function\nor class definitions. 
Therefore, it is of utmost importance that the example\nworks as expected.\n\n"
  },
  {
    "path": "docs/source/en/_config.py",
    "content": "# docstyle-ignore\nINSTALL_CONTENT = \"\"\"\n# Installation\n! pip install smolagents\n# To install from source instead of the last release, comment the command above and uncomment the following one.\n# ! pip install git+https://github.com/huggingface/smolagents.git\n\"\"\"\n\nnotebook_first_cells = [{\"type\": \"code\", \"content\": INSTALL_CONTENT}]\nblack_avoid_patterns = {\n    \"{processor_class}\": \"FakeProcessorClass\",\n    \"{model_class}\": \"FakeModelClass\",\n    \"{object_class}\": \"FakeObjectClass\",\n}\n"
  },
  {
    "path": "docs/source/en/_toctree.yml",
    "content": "- title: Get started\n  sections:\n  - local: index\n    title: Introduction\n  - local: installation\n    title: Installation options\n  - local: guided_tour\n    title: Guided tour\n- title: Tutorials\n  sections:\n  - local: tutorials/building_good_agents\n    title: ✨ Building good agents\n  - local: tutorials/inspect_runs\n    title: 📊 Inspect your agent runs using telemetry\n  - local: tutorials/tools\n    title: 🛠️ Tools - in-depth guide\n  - local: tutorials/secure_code_execution\n    title: 🛡️ Secure code execution\n  - local: tutorials/memory\n    title: 📚 Manage your agent's memory\n- title: Conceptual guides\n  sections:\n  - local: conceptual_guides/intro_agents\n    title: 🤖 What are agents?\n  - local: conceptual_guides/react\n    title: 🤔 How do Multi-step agents work?\n- title: Examples\n  sections:\n  - local: examples/text_to_sql\n    title: Self-correcting Text-to-SQL\n  - local: examples/rag\n    title: Master your knowledge base with agentic RAG\n  - local: examples/multiagents\n    title: Orchestrate a multi-agent system\n  - local: examples/web_browser\n    title: Build a web browser agent using vision models\n  - local: examples/using_different_models\n    title: Using different models\n  - local: examples/plan_customization\n    title: \"Human-in-the-Loop: Customize agent plan interactively\"\n  - local: examples/async_agent\n    title: Async Applications with Agents\n- title: Reference\n  sections:\n  - title: Agent-related objects\n    sections:\n    - title: Agents\n      local: reference/agents\n    - title: Python executors\n      local: reference/python_executors\n  - local: reference/models\n    title: Model-related objects\n  - title: Tools\n    sections:\n    - title: Tool-related objects\n      local: reference/tools\n    - title: Built-in Tools\n      local: reference/default_tools\n"
  },
  {
    "path": "docs/source/en/conceptual_guides/intro_agents.md",
    "content": "# What are agents? 🤔\n\n## An introduction to agentic systems.\n\nAny efficient system using AI will need to provide LLMs some kind of access to the real world: for instance the possibility to call a search tool to get external information, or to act on certain programs in order to solve a task. In other words, LLMs should have ***agency***. Agentic programs are the gateway to the outside world for LLMs.\n\n> [!TIP]\n> AI Agents are **programs where LLM outputs control the workflow**.\n\nAny system leveraging LLMs will integrate the LLM outputs into code. The influence of the LLM's input on the code workflow is the level of agency of LLMs in the system.\n\nNote that with this definition, \"agent\" is not a discrete, 0 or 1 definition: instead, \"agency\" evolves on a continuous spectrum, as you give more or less power to the LLM on your workflow.\n\nSee in the table below how agency can vary across systems:\n\n| Agency Level | Description                                                     | Short name       | Example Code                                       |\n| ------------ | --------------------------------------------------------------- | ---------------- | -------------------------------------------------- |\n| ☆☆☆          | LLM output has no impact on program flow                        | Simple processor | `process_llm_output(llm_response)`                 |\n| ★☆☆          | LLM output controls an if/else switch                           | Router           | `if llm_decision(): path_a() else: path_b()`       |\n| ★★☆          | LLM output controls function execution                          | Tool call        | `run_function(llm_chosen_tool, llm_chosen_args)`   |\n| ★★☆          | LLM output controls iteration and program continuation          | Multi-step Agent | `while llm_should_continue(): execute_next_step()` |\n| ★★★          | One agentic workflow can start another agentic workflow         | Multi-Agent      | `if llm_trigger(): 
execute_agent()`                |\n| ★★★          | LLM acts in code, can define its own tools / start other agents | Code Agents      | `def custom_tool(args): ...`                       |\n\nThe multi-step agent has this code structure:\n\n```python\nmemory = [user_defined_task]\nwhile llm_should_continue(memory): # this loop is the multi-step part\n    action = llm_get_next_action(memory) # this is the tool-calling part\n    observations = execute_action(action)\n    memory += [action, observations]\n```\n\nThis agentic system runs in a loop, executing a new action at each step (the action can involve calling some pre-determined *tools* that are just functions), until its observations make it apparent that a satisfactory state has been reached to solve the given task. Here’s an example of how a multi-step agent can solve a simple math question:\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"/>\n</div>\n\n\n## ✅ When to use agents / ⛔ when to avoid them\n\nAgents are useful when you need an LLM to determine the workflow of an app. But they’re often overkill. The question is: do I really need flexibility in the workflow to efficiently solve the task at hand?\nIf the pre-determined workflow falls short too often, that means you need more flexibility.\nLet's take an example: say you're making an app that handles customer requests on a surfing trip website.\n\nYou could know in advance that the requests will belong to either of 2 buckets (based on user choice), and you have a predefined workflow for each of these 2 cases.\n\n1. Want some knowledge on the trips? ⇒ give them access to a search bar to search your knowledge base\n2. Wants to talk to sales? ⇒ let them type in a contact form.\n\nIf that deterministic workflow fits all queries, by all means just code everything! 
This will give you a 100% reliable system with no risk of error introduced by letting unpredictable LLMs meddle in your workflow. For the sake of simplicity and robustness, it's advisable to default to not using any agentic behaviour.\n\nBut what if the workflow can't be determined that well in advance?\n\nFor instance, a user wants to ask: `\"I can come on Monday, but I forgot my passport so risk being delayed to Wednesday, is it possible to take me and my stuff to surf on Tuesday morning, with a cancellation insurance?\"` This question hinges on many factors, and probably none of the predetermined criteria above will suffice for this request.\n\nThat is where an agentic setup helps.\n\nIn the above example, you could just make a multi-step agent that has access to a weather API for weather forecasts, the Google Maps API to compute travel distance, an employee availability dashboard and a RAG system on your knowledge base.\n\nUntil recently, computer programs were restricted to pre-determined workflows, trying to handle complexity by piling up if/else switches. They focused on extremely narrow tasks, like \"compute the sum of these numbers\" or \"find the shortest path in this graph\". But actually, most real-life tasks, like our trip example above, do not fit in pre-determined workflows. Agentic systems open up the vast world of real-world tasks to programs!\n\n## Why `smolagents`?\n\nFor some low-level agentic use cases, like chains or routers, you can write all the code yourself. 
You'll be much better off that way, since it will let you control and understand your system better.\n\nBut once you start going for more complicated behaviours like letting an LLM call a function (that's \"tool calling\") or letting an LLM run a while loop (\"multi-step agent\"), some abstractions become necessary:\n- For tool calling, you need to parse the agent's output, so this output needs a predefined format like \"Thought: I should call tool 'get_weather'. Action: get_weather(Paris).\", which you parse with a predefined function, and the system prompt given to the LLM should tell it about this format.\n- For a multi-step agent where the LLM output determines the loop, you need to give a different prompt to the LLM based on what happened in the last loop iteration: so you need some kind of memory.\n\nSee? With these two examples, we already found the need for a few items to help us:\n\n- Of course, an LLM that acts as the engine powering the system\n- A list of tools that the agent can access\n- A system prompt guiding the LLM on the agent logic: ReAct loop of Reflection -> Action -> Observation, available tools, tool calling format to use...\n- A parser that extracts tool calls from the LLM output, in the format indicated by the system prompt above.\n- A memory\n\nBut wait, since we give room to LLMs in decisions, surely they will make mistakes: so we need error logging and retry mechanisms.\n\nAll these elements need tight coupling to make a well-functioning system. That's why we decided we needed to make basic building blocks to make all this stuff work together.\n\n## Code agents\n\nIn a multi-step agent, at each step, the LLM can write an action, in the form of some calls to external tools. 
A common format (used by Anthropic, OpenAI, and many others) for writing these actions is generally different shades of \"writing actions as a JSON of tool names and arguments to use, which you then parse to know which tool to execute and with which arguments\".\n\n[Multiple](https://huggingface.co/papers/2402.01030) [research](https://huggingface.co/papers/2411.01747) [papers](https://huggingface.co/papers/2401.00812) have shown that having the LLM's actions written as code snippets is a more natural and flexible way of writing them.\n\nThe reason for this is simply that *we crafted our programming languages specifically to express the actions performed by a computer*.\nIn other words, our agent is going to write programs in order to solve the user's issues: do you think their programming will be easier in blocks of Python or JSON?\n\nThe figure below, taken from [Executable Code Actions Elicit Better LLM Agents](https://huggingface.co/papers/2402.01030), illustrates some advantages of writing actions in code:\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png\">\n\nWriting actions in code rather than JSON-like snippets provides better:\n\n- **Composability:** could you nest JSON actions within each other, or define a set of JSON actions to re-use later, the same way you could just define a python function?\n- **Object management:** how do you store the output of an action like `generate_image` in JSON?\n- **Generality:** code is built to express simply anything you can have a computer do.\n- **Representation in LLM training data:** plenty of quality code actions are already included in LLMs’ training data, which means they’re already trained for this!\n"
  },
  {
    "path": "docs/source/en/conceptual_guides/react.md",
    "content": "# How do multi-step agents work?\n\nThe ReAct framework ([Yao et al., 2022](https://huggingface.co/papers/2210.03629)) is currently the main approach to building agents.\n\nThe name is based on the concatenation of two words, \"Reason\" and \"Act.\" Indeed, agents following this architecture will solve their task in as many steps as needed, each step consisting of a Reasoning step, then an Action step where it formulates tool calls that will bring it closer to solving the task at hand.\n\nAll agents in `smolagents` are based on singular `MultiStepAgent` class, which is an abstraction of ReAct framework.\n\nOn a basic level, this class performs actions on a cycle of following steps, where existing variables and knowledge is incorporated into the agent logs like below: \n\nInitialization: the system prompt is stored in a `SystemPromptStep`, and the user query is logged into a `TaskStep` .\n\nWhile loop (ReAct loop):\n\n- Use `agent.write_memory_to_messages()` to write the agent logs into a list of LLM-readable [chat messages](https://huggingface.co/docs/transformers/en/chat_templating).\n- Send these messages to a `Model` object to get its completion. Parse the completion to get the action (a JSON blob for `ToolCallingAgent`, a code snippet for `CodeAgent`).\n- Execute the action and logs result into memory (an `ActionStep`).\n- At the end of each step, we run all callback functions defined in `agent.step_callbacks` .\n\nOptionally, when planning is activated, a plan can be periodically revised and stored in a `PlanningStep` . 
This includes feeding facts about the task at hand to the memory.\n\nFor a `CodeAgent`, it looks like the figure below.\n\n<div class=\"flex justify-center\">\n    <img\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/codeagent_docs.png\"\n    />\n</div>\n\nHere is an animated overview of how that works:\n\n<div class=\"flex justify-center\">\n    <img\n        class=\"block dark:hidden\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n    <img\n        class=\"hidden dark:block\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n</div>\n\nWe implement two versions of agents:\n- [`CodeAgent`] generates its tool calls as Python code snippets.\n- [`ToolCallingAgent`] writes its tool calls as JSON, as is common in many frameworks. Depending on your needs, either approach can be used. For instance, web browsing often requires waiting after each page interaction, so JSON tool calls can fit well.\n\n> [!TIP]\n> Read the [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) blog post to learn more about multi-step agents.\n"
  },
  {
    "path": "docs/source/en/examples/async_agent.md",
    "content": "# Async Applications with Agents\n\nThis guide demonstrates how to integrate a synchronous agent from the `smolagents` library into an asynchronous Python web application using Starlette.\nThe example is designed to help users new to async Python and agent integration understand best practices for combining synchronous agent logic with async web servers.\n\n## Overview\n\n- **Starlette**: A lightweight ASGI framework for building asynchronous web applications in Python.\n- **anyio.to_thread.run_sync**: Utility to run blocking (synchronous) code in a background thread, preventing it from blocking the async event loop.\n- **CodeAgent**: An agent from the `smolagents` library capable of programmatically solving tasks.\n\n## Why Use a Background Thread?\n\n`CodeAgent.run()` executes Python code synchronously. If called directly in an async endpoint, it would block Starlette's event loop, reducing performance and scalability. By offloading this operation to a background thread with `anyio.to_thread.run_sync`, you keep the app responsive and efficient, even under high concurrency.\n\n## Example Workflow\n\n- The Starlette app exposes a `/run-agent` endpoint that accepts a JSON payload with a `task` string.\n- When a request is received, the agent is run in a background thread using `anyio.to_thread.run_sync`.\n- The result is returned as a JSON response.\n\n## Building a Starlette App with a CodeAgent\n\n### 1. Install Dependencies\n\n```bash\npip install smolagents starlette anyio uvicorn\n```\n\n### 2. 
Application Code (`main.py`)\n\n```python\nimport anyio.to_thread\nfrom starlette.applications import Starlette\nfrom starlette.requests import Request\nfrom starlette.responses import JSONResponse\nfrom starlette.routing import Route\n\nfrom smolagents import CodeAgent, InferenceClientModel\n\nagent = CodeAgent(\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n    tools=[],\n)\n\nasync def run_agent(request: Request):\n    data = await request.json()\n    task = data.get(\"task\", \"\")\n    # Run the agent synchronously in a background thread\n    result = await anyio.to_thread.run_sync(agent.run, task)\n    return JSONResponse({\"result\": result})\n\napp = Starlette(routes=[\n    Route(\"/run-agent\", run_agent, methods=[\"POST\"]),\n])\n```\n\n### 3. Run the App\n\n```bash\nuvicorn async_agent.main:app --reload\n```\n\n### 4. Test the Endpoint\n\n```bash\ncurl -X POST http://localhost:8000/run-agent -H 'Content-Type: application/json' -d '{\"task\": \"What is 2+2?\"}'\n```\n\n**Expected Response:**\n\n```json\n{\"result\": \"4\"}\n```\n\n## Further Reading\n\n- [Starlette Documentation](https://www.starlette.io/)\n- [anyio Documentation](https://anyio.readthedocs.io/)\n\n---\n\nFor the full code, see [`examples/async_agent`](https://github.com/huggingface/smolagents/tree/main/examples/async_agent).\n"
  },
  {
    "path": "docs/source/en/examples/multiagents.md",
    "content": "# Orchestrate a multi-agent system 🤖🤝🤖\n\n[[open-in-colab]]\n\nIn this notebook we will make a **multi-agent web browser: an agentic system with several agents collaborating to solve problems using the web!**\n\nIt will be a simple hierarchy:\n\n```\n              +----------------+\n              | Manager agent  |\n              +----------------+\n                       |\n        _______________|______________\n       |                              |\nCode Interpreter            +------------------+\n    tool                    | Web Search agent |\n                            +------------------+\n                               |            |\n                        Web Search tool     |\n                                   Visit webpage tool\n```\nLet's set up this system. \n\nRun the line below to install the required dependencies:\n\n```py\n!pip install 'smolagents[toolkit]' --upgrade -q\n```\n\nLet's login to HF in order to call Inference Providers:\n\n```py\nfrom huggingface_hub import login\n\nlogin()\n```\n\n⚡️ Our agent will be powered by [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking) using `InferenceClientModel` class that uses HF's Inference API: the Inference API allows to quickly and easily run any OS model.\n\n> [!TIP]\n> Inference Providers give access to hundreds of models, powered by serverless inference partners. 
A list of supported providers can be found [here](https://huggingface.co/docs/inference-providers/index).\n\n```py\nmodel_id = \"Qwen/Qwen3-Next-80B-A3B-Thinking\"\n```\n\n## 🔍 Create a web search tool\n\nFor web browsing, we can already use the native [`WebSearchTool`] to provide a Google search equivalent.\n\nBut then we will also need to be able to peek into the page found by the `WebSearchTool`.\nTo do so, we could import the library's built-in `VisitWebpageTool`, but we will build it again to see how it's done.\n\nSo let's create our `VisitWebpageTool` tool from scratch using `markdownify`.\n\n```py\nimport re\nimport requests\nfrom markdownify import markdownify\nfrom requests.exceptions import RequestException\nfrom smolagents import tool\n\n\n@tool\ndef visit_webpage(url: str) -> str:\n    \"\"\"Visits a webpage at the given URL and returns its content as a markdown string.\n\n    Args:\n        url: The URL of the webpage to visit.\n\n    Returns:\n        The content of the webpage converted to Markdown, or an error message if the request fails.\n    \"\"\"\n    try:\n        # Send a GET request to the URL\n        response = requests.get(url)\n        response.raise_for_status()  # Raise an exception for bad status codes\n\n        # Convert the HTML content to Markdown\n        markdown_content = markdownify(response.text).strip()\n\n        # Remove multiple line breaks\n        markdown_content = re.sub(r\"\\n{3,}\", \"\\n\\n\", markdown_content)\n\n        return markdown_content\n\n    except RequestException as e:\n        return f\"Error fetching the webpage: {str(e)}\"\n    except Exception as e:\n        return f\"An unexpected error occurred: {str(e)}\"\n```\n\nOk, now let's initialize and test our tool!\n\n```py\nprint(visit_webpage(\"https://en.wikipedia.org/wiki/Hugging_Face\")[:500])\n```\n\n## Build our multi-agent system 🤖🤝🤖\n\nNow that we have both tools, `WebSearchTool` and `visit_webpage`, we can use them to create the web 
agent.\n\nWhich configuration to choose for this agent?\n- Web browsing is a single-timeline task that does not require parallel tool calls, so JSON tool calling works well for that. We thus choose a `ToolCallingAgent`.\n- Also, since sometimes web search requires exploring many pages before finding the correct answer, we prefer to increase the number of `max_steps` to 10.\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    InferenceClientModel,\n    WebSearchTool,\n)\n\nmodel = InferenceClientModel(model_id=model_id)\n\nweb_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), visit_webpage],\n    model=model,\n    max_steps=10,\n    name=\"web_search_agent\",\n    description=\"Runs web searches for you.\",\n)\n```\n\nNote that we gave this agent attributes `name` and `description`, mandatory attributes to make this agent callable by its manager agent.\n\nThen we create a manager agent, and upon initialization we pass our managed agent to it in its `managed_agents` argument.\n\nSince this agent is the one tasked with the planning and thinking, advanced reasoning will be beneficial, so a `CodeAgent` will work well.\n\nAlso, we want to ask a question that involves the current year and does additional data calculations: so let us add `additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"]`, just in case the agent needs these packages.\n\n```py\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[web_agent],\n    additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"],\n)\n```\n\nThat's all! Now let's run our system! We select a question that requires both some calculation and research:\n\n```py\nanswer = manager_agent.run(\"If LLM training continues to scale up at the current rhythm until 2030, what would be the electric power in GW required to power the biggest training runs by 2030? What would that correspond to, compared to some countries? 
Please provide a source for any numbers used.\")\n```\n\nWe get this report as the answer:\n```\nBased on current growth projections and energy consumption estimates, if LLM trainings continue to scale up at the \ncurrent rhythm until 2030:\n\n1. The electric power required to power the biggest training runs by 2030 would be approximately 303.74 GW, which \ntranslates to about 2,660,762 GWh/year.\n\n2. Comparing this to countries' electricity consumption:\n   - It would be equivalent to about 34% of China's total electricity consumption.\n   - It would exceed the total electricity consumption of India (184%), Russia (267%), and Japan (291%).\n   - It would be nearly 9 times the electricity consumption of countries like Italy or Mexico.\n\n3. Source of numbers:\n   - The initial estimate of 5 GW for future LLM training comes from AWS CEO Matt Garman.\n   - The growth projection used a CAGR of 79.80% from market research by Springs.\n   - Country electricity consumption data is from the U.S. Energy Information Administration, primarily for the year \n2021.\n```\n\nSeems like we'll need some sizeable power plants if the [scaling hypothesis](https://gwern.net/scaling-hypothesis) continues to hold true.\n\nOur agents managed to efficiently collaborate towards solving the task! ✅\n\n💡 You can easily extend this orchestration to more agents: one does the code execution, one the web search, one handles file loading...\n"
  },
  {
    "path": "docs/source/en/examples/plan_customization.md",
    "content": "# Human-in-the-Loop: Customize Agent Plan Interactively\n\nThis page demonstrates advanced usage of the smolagents library, with a special focus on **Human-in-the-Loop (HITL)** approaches for interactive plan creation, user-driven plan modification, and memory preservation in agentic workflows.\nThe example is based on the code in `examples/plan_customization/plan_customization.py`.\n\n## Overview\n\nThis example teaches you how to implement Human-in-the-Loop strategies to:\n\n- Interrupt agent execution after a plan is created (using step callbacks)\n- Allow users to review and modify the agent's plan before execution (Human-in-the-Loop)\n- Resume execution while preserving the agent's memory\n- Dynamically update plans based on user feedback, keeping the human in control\n\n## Key Concepts\n\n### Step Callbacks for Plan Interruption\n\nThe agent is configured to pause after creating a plan. This is achieved by registering a step callback for the `PlanningStep`:\n\n```python\nagent = CodeAgent(\n    model=InferenceClientModel(),\n    tools=[DuckDuckGoSearchTool()],\n    planning_interval=5,  # Plan every 5 steps\n    step_callbacks={PlanningStep: interrupt_after_plan},\n    max_steps=10,\n    verbosity_level=1\n)\n```\n\n### Human-in-the-Loop: Interactive Plan Review and Modification\n\nWhen the agent creates a plan, the callback displays it and prompts the human user to:\n\n1. Approve the plan\n2. Modify the plan\n3. Cancel execution\n\nExample interaction:\n\n```\n============================================================\n🤖 AGENT PLAN CREATED\n============================================================\n1. Search for recent AI developments\n2. Analyze the top results\n3. Summarize the 3 most significant breakthroughs\n4. Include sources for each breakthrough\n============================================================\n\nChoose an option:\n1. Approve plan\n2. Modify plan\n3. 
Cancel\nYour choice (1-3):\n```\n\nThis Human-in-the-Loop step enables a human to intervene and review or modify the plan before execution continues, and ensures that the agent's actions align with human intent.\n\nIf the user chooses to modify, they can edit the plan directly. The updated plan is then used for subsequent execution steps.\n\n### Memory Preservation and Resuming Execution\n\nBy running the agent with `reset=False`, all previous steps and memory are preserved. This allows you to resume execution after an interruption or plan modification:\n\n```python\n# First run (may be interrupted)\nagent.run(task, reset=True)\n\n# Resume with preserved memory\nagent.run(task, reset=False)\n```\n\n### Inspecting Agent Memory\n\nYou can inspect the agent's memory to see all steps taken so far:\n\n```python\nprint(f\"Current memory contains {len(agent.memory.steps)} steps:\")\nfor i, step in enumerate(agent.memory.steps):\n    step_type = type(step).__name__\n    print(f\"  {i+1}. {step_type}\")\n```\n\n## Example Human-in-the-Loop Workflow\n\n1. Agent starts with a complex task\n2. Planning step is created and execution pauses for human review\n3. Human reviews and optionally modifies the plan (Human-in-the-Loop)\n4. Execution resumes with the approved/modified plan\n5. 
All steps are preserved for future runs, maintaining transparency and control\n\n## Error Handling\n\nThe example includes error handling for:\n- User cancellation\n- Plan modification errors\n- Resume execution failures\n\n## Requirements\n\n- smolagents library\n- DuckDuckGoSearchTool (included with smolagents)\n- InferenceClientModel (requires a Hugging Face API token)\n\n## Educational Value\n\nThis example demonstrates:\n- Step callback implementation for custom agent behavior\n- Memory management in multi-step agents\n- User interaction patterns in agentic systems\n- Plan modification techniques for dynamic agent control\n- Error handling in interactive agent systems\n\n---\n\nFor the full code, see [`examples/plan_customization`](https://github.com/huggingface/smolagents/tree/main/examples/plan_customization).\n"
  },
  {
    "path": "docs/source/en/examples/rag.md",
    "content": "# Agentic RAG\n\n[[open-in-colab]]\n\n## Introduction to Retrieval-Augmented Generation (RAG)\n\nRetrieval-Augmented Generation (RAG) combines the power of large language models with external knowledge retrieval to produce more accurate, factual, and contextually relevant responses. At its core, RAG is about \"using an LLM to answer a user query, but basing the answer on information retrieved from a knowledge base.\"\n\n### Why Use RAG?\n\nRAG offers several significant advantages over using vanilla or fine-tuned LLMs:\n\n1. **Factual Grounding**: Reduces hallucinations by anchoring responses in retrieved facts\n2. **Domain Specialization**: Provides domain-specific knowledge without model retraining\n3. **Knowledge Recency**: Allows access to information beyond the model's training cutoff\n4. **Transparency**: Enables citation of sources for generated content\n5. **Control**: Offers fine-grained control over what information the model can access\n\n### Limitations of Traditional RAG\n\nDespite its benefits, traditional RAG approaches face several challenges:\n\n- **Single Retrieval Step**: If the initial retrieval results are poor, the final generation will suffer\n- **Query-Document Mismatch**: User queries (often questions) may not match well with documents containing answers (often statements)\n- **Limited Reasoning**: Simple RAG pipelines don't allow for multi-step reasoning or query refinement\n- **Context Window Constraints**: Retrieved documents must fit within the model's context window\n\n## Agentic RAG: A More Powerful Approach\n\nWe can overcome these limitations by implementing an **Agentic RAG** system - essentially an agent equipped with retrieval capabilities. This approach transforms RAG from a rigid pipeline into an interactive, reasoning-driven process.\n\n### Key Benefits of Agentic RAG\n\nAn agent with retrieval tools can:\n\n1. 
✅ **Formulate optimized queries**: The agent can transform user questions into retrieval-friendly queries\n2. ✅ **Perform multiple retrievals**: The agent can retrieve information iteratively as needed\n3. ✅ **Reason over retrieved content**: The agent can analyze, synthesize, and draw conclusions from multiple sources\n4. ✅ **Self-critique and refine**: The agent can evaluate retrieval results and adjust its approach\n\nThis approach naturally implements advanced RAG techniques:\n- **Hypothetical Document Embedding (HyDE)**: Instead of using the user query directly, the agent formulates retrieval-optimized queries ([paper reference](https://huggingface.co/papers/2212.10496))\n- **Self-Query Refinement**: The agent can analyze initial results and perform follow-up retrievals with refined queries ([technique reference](https://docs.llamaindex.ai/en/stable/examples/evaluation/RetryQuery/))\n\n## Building an Agentic RAG System\n\nLet's build a complete Agentic RAG system step by step. We'll create an agent that can answer questions about the Hugging Face Transformers library by retrieving information from its documentation.\n\nYou can follow along with the code snippets below, or check out the full example in the smolagents GitHub repository: [examples/rag.py](https://github.com/huggingface/smolagents/blob/main/examples/rag.py).\n\n### Step 1: Install Required Dependencies\n\nFirst, we need to install the necessary packages:\n\n```bash\npip install smolagents pandas langchain langchain-community sentence-transformers datasets python-dotenv rank_bm25 --upgrade\n```\n\nIf you plan to use Hugging Face's Inference API, you'll need to set up your API token:\n\n```python\n# Load environment variables (including HF_TOKEN)\nfrom dotenv import load_dotenv\nload_dotenv()\n```\n\n### Step 2: Prepare the Knowledge Base\n\nWe'll use a dataset containing Hugging Face documentation and prepare it for retrieval:\n\n```python\nimport datasets\nfrom langchain.docstore.document import 
Document\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain_community.retrievers import BM25Retriever\n\n# Load the Hugging Face documentation dataset\nknowledge_base = datasets.load_dataset(\"m-ric/huggingface_doc\", split=\"train\")\n\n# Filter to include only Transformers documentation\nknowledge_base = knowledge_base.filter(lambda row: row[\"source\"].startswith(\"huggingface/transformers\"))\n\n# Convert dataset entries to Document objects with metadata\nsource_docs = [\n    Document(page_content=doc[\"text\"], metadata={\"source\": doc[\"source\"].split(\"/\")[1]})\n    for doc in knowledge_base\n]\n\n# Split documents into smaller chunks for better retrieval\ntext_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=500,  # Characters per chunk\n    chunk_overlap=50,  # Overlap between chunks to maintain context\n    add_start_index=True,\n    strip_whitespace=True,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],  # Priority order for splitting\n)\ndocs_processed = text_splitter.split_documents(source_docs)\n\nprint(f\"Knowledge base prepared with {len(docs_processed)} document chunks\")\n```\n\n### Step 3: Create a Retriever Tool\n\nNow we'll create a custom tool that our agent can use to retrieve information from the knowledge base:\n\n```python\nfrom smolagents import Tool\n\nclass RetrieverTool(Tool):\n    name = \"retriever\"\n    description = \"Uses semantic search to retrieve the parts of transformers documentation that could be most relevant to answer your query.\"\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"The query to perform. This should be semantically close to your target documents. 
Use the affirmative form rather than a question.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, docs, **kwargs):\n        super().__init__(**kwargs)\n        # Initialize the retriever with our processed documents\n        self.retriever = BM25Retriever.from_documents(\n            docs, k=10  # Return top 10 most relevant documents\n        )\n\n    def forward(self, query: str) -> str:\n        \"\"\"Execute the retrieval based on the provided query.\"\"\"\n        assert isinstance(query, str), \"Your search query must be a string\"\n\n        # Retrieve relevant documents\n        docs = self.retriever.invoke(query)\n\n        # Format the retrieved documents for readability\n        return \"\\nRetrieved documents:\\n\" + \"\".join(\n            [\n                f\"\\n\\n===== Document {str(i)} =====\\n\" + doc.page_content\n                for i, doc in enumerate(docs)\n            ]\n        )\n\n# Initialize our retriever tool with the processed documents\nretriever_tool = RetrieverTool(docs_processed)\n```\n\n> [!TIP]\n> We're using BM25, a lexical retrieval method, for simplicity and speed. For production systems, you might want to use semantic search with embeddings for better retrieval quality. 
Check the [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for high-quality embedding models.\n\n### Step 4: Create an Advanced Retrieval Agent\n\nNow we'll create an agent that can use our retriever tool to answer questions:\n\n```python\nfrom smolagents import InferenceClientModel, CodeAgent\n\n# Initialize the agent with our retriever tool\nagent = CodeAgent(\n    tools=[retriever_tool],  # List of tools available to the agent\n    model=InferenceClientModel(),  # Default model \"Qwen/Qwen3-Next-80B-A3B-Thinking\"\n    max_steps=4,  # Limit the number of reasoning steps\n    verbosity_level=2,  # Show detailed agent reasoning\n)\n\n# To use a specific model, you can specify it like this:\n# model=InferenceClientModel(model_id=\"meta-llama/Llama-3.3-70B-Instruct\")\n```\n\n> [!TIP]\n> Inference Providers give access to hundreds of models, powered by serverless inference partners. A list of supported providers can be found [here](https://huggingface.co/docs/inference-providers/index).\n\n### Step 5: Run the Agent to Answer Questions\n\nLet's use our agent to answer a question about Transformers:\n\n```python\n# Ask a question that requires retrieving information\nquestion = \"For a transformers model training, which is slower, the forward or the backward pass?\"\n\n# Run the agent to get an answer\nagent_output = agent.run(question)\n\n# Display the final answer\nprint(\"\\nFinal answer:\")\nprint(agent_output)\n```\n\n## Practical Applications of Agentic RAG\n\nAgentic RAG systems can be applied to various use cases:\n\n1. **Technical Documentation Assistance**: Help users navigate complex technical documentation\n2. **Research Paper Analysis**: Extract and synthesize information from scientific papers\n3. **Legal Document Review**: Find relevant precedents and clauses in legal documents\n4. **Customer Support**: Answer questions based on product documentation and knowledge bases\n5. 
**Educational Tutoring**: Provide explanations based on textbooks and learning materials\n\n## Conclusion\n\nAgentic RAG represents a significant advancement over traditional RAG pipelines. By combining the reasoning capabilities of LLM agents with the factual grounding of retrieval systems, we can build more powerful, flexible, and accurate information systems.\n\nThe approach we've demonstrated:\n- Overcomes the limitations of single-step retrieval\n- Enables more natural interactions with knowledge bases\n- Provides a framework for continuous improvement through self-critique and query refinement\n\nAs you build your own Agentic RAG systems, consider experimenting with different retrieval methods, agent architectures, and knowledge sources to find the optimal configuration for your specific use case.\n"
  },
  {
    "path": "docs/source/en/examples/text_to_sql.md",
    "content": "# Text-to-SQL\n\n[[open-in-colab]]\n\nIn this tutorial, we’ll see how to implement an agent that leverages SQL using `smolagents`.\n\n> Let's start with the golden question: why not keep it simple and use a standard text-to-SQL pipeline?\n\nA standard text-to-sql pipeline is brittle, since the generated SQL query can be incorrect. Even worse, the query could be incorrect, but not raise an error, instead giving some incorrect/useless outputs without raising an alarm.\n\n👉 Instead, an agent system is able to critically inspect outputs and decide if the query needs to be changed or not, thus giving it a huge performance boost.\n\nLet’s build this agent! 💪\n\nRun the line below to install required dependencies:\n```bash\n!pip install smolagents python-dotenv sqlalchemy --upgrade -q\n```\nTo call Inference Providers, you will need a valid token as your environment variable `HF_TOKEN`.\nWe use python-dotenv to load it.\n```py\nfrom dotenv import load_dotenv\nload_dotenv()\n```\n\nThen, we setup the SQL environment:\n```py\nfrom sqlalchemy import (\n    create_engine,\n    MetaData,\n    Table,\n    Column,\n    String,\n    Integer,\n    Float,\n    insert,\n    inspect,\n    text,\n)\n\nengine = create_engine(\"sqlite:///:memory:\")\nmetadata_obj = MetaData()\n\ndef insert_rows_into_table(rows, table, engine=engine):\n    for row in rows:\n        stmt = insert(table).values(**row)\n        with engine.begin() as connection:\n            connection.execute(stmt)\n\ntable_name = \"receipts\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"customer_name\", String(16), primary_key=True),\n    Column(\"price\", Float),\n    Column(\"tip\", Float),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"customer_name\": \"Alan Payne\", \"price\": 12.06, \"tip\": 1.20},\n    {\"receipt_id\": 2, \"customer_name\": \"Alex Mason\", \"price\": 23.86, \"tip\": 0.24},\n   
 {\"receipt_id\": 3, \"customer_name\": \"Woodrow Wilson\", \"price\": 53.43, \"tip\": 5.43},\n    {\"receipt_id\": 4, \"customer_name\": \"Margaret James\", \"price\": 21.11, \"tip\": 1.00},\n]\ninsert_rows_into_table(rows, receipts)\n```\n\n### Build our agent\n\nNow let’s make our SQL table retrievable by a tool.\n\nThe tool’s description attribute will be embedded in the LLM’s prompt by the agent system: it gives the LLM information about how to use the tool. This is where we want to describe the SQL table.\n\n```py\ninspector = inspect(engine)\ncolumns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(\"receipts\")]\n\ntable_description = \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\nprint(table_description)\n```\n\n```text\nColumns:\n  - receipt_id: INTEGER\n  - customer_name: VARCHAR(16)\n  - price: FLOAT\n  - tip: FLOAT\n```\n\nNow let’s build our tool. It needs the following: (read [the tool doc](../tutorials/tools) for more detail)\n- A docstring with an `Args:` part listing arguments.\n- Type hints on both inputs and output.\n\n```py\nfrom smolagents import tool\n\n@tool\ndef sql_engine(query: str) -> str:\n    \"\"\"\n    Allows you to perform SQL queries on the table. Returns a string representation of the result.\n    The table is named 'receipts'. Its description is as follows:\n        Columns:\n        - receipt_id: INTEGER\n        - customer_name: VARCHAR(16)\n        - price: FLOAT\n        - tip: FLOAT\n\n    Args:\n        query: The query to perform. 
This should be correct SQL.\n    \"\"\"\n    output = \"\"\n    with engine.connect() as con:\n        rows = con.execute(text(query))\n        for row in rows:\n            output += \"\\n\" + str(row)\n    return output\n```\n\nNow let us create an agent that leverages this tool.\n\nWe use the `CodeAgent`, which is smolagents’ main agent class: an agent that writes actions in code and can iterate on previous output according to the ReAct framework.\n\nThe model is the LLM that powers the agent system. `InferenceClientModel` allows you to call LLMs using HF’s Inference API, either via Serverless or Dedicated endpoint, but you could also use any proprietary API.\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"meta-llama/Llama-3.1-8B-Instruct\"),\n)\nagent.run(\"Can you give me the name of the client who got the most expensive receipt?\")\n```\n\n### Level 2: Table joins\n\nNow let’s make it more challenging! We want our agent to handle joins across multiple tables.\n\nSo let’s make a second table recording the names of waiters for each receipt_id!\n\n```py\ntable_name = \"waiters\"\nwaiters = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"waiter_name\", String(16), primary_key=True),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"waiter_name\": \"Corey Johnson\"},\n    {\"receipt_id\": 2, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 3, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 4, \"waiter_name\": \"Margaret James\"},\n]\ninsert_rows_into_table(rows, waiters)\n```\nSince we added a table, we update the description of the `sql_engine` tool with both tables’ schemas so that the LLM can properly leverage information from the new table.\n\n```py\nupdated_description = \"\"\"Allows you to perform SQL queries on the table. 
Beware that this tool's output is a string representation of the execution output.\nIt can use the following tables:\"\"\"\n\ninspector = inspect(engine)\nfor table in [\"receipts\", \"waiters\"]:\n    columns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(table)]\n\n    table_description = f\"Table '{table}':\\n\"\n\n    table_description += \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\n    updated_description += \"\\n\\n\" + table_description\n\nprint(updated_description)\n```\nSince this request is a bit harder than the previous one, we’ll switch the LLM engine to use the more powerful [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking)!\n\n```py\nsql_engine.description = updated_description\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n)\n\nagent.run(\"Which waiter got more total money from tips?\")\n```\nIt works right away! The setup was surprisingly simple, wasn’t it?\n\nThis example is done! We've touched upon these concepts:\n- Building new tools.\n- Updating a tool's description.\n- Switching to a stronger LLM to help agent reasoning.\n\n✅ Now you can go build this text-to-SQL system you’ve always dreamt of! ✨\n"
  },
  {
    "path": "docs/source/en/examples/using_different_models.md",
    "content": "# Using different models\n\n[[open-in-colab]]\n\n`smolagents` provides a flexible framework that allows you to use various language models from different providers.\nThis guide will show you how to use different model types with your agents.\n\n## Available model types\n\n`smolagents` supports several model types out of the box:\n1. [`InferenceClientModel`]: Uses Hugging Face's Inference API to access models\n2. [`TransformersModel`]: Runs models locally using the Transformers library\n3. [`VLLMModel`]: Uses vLLM for fast inference with optimized serving\n4. [`MLXModel`]: Optimized for Apple Silicon devices using MLX\n5. [`LiteLLMModel`]: Provides access to hundreds of LLMs through LiteLLM\n6. [`LiteLLMRouterModel`]: Distributes requests among multiple models\n7. [`OpenAIModel`]: Provides access to any provider that implements an OpenAI-compatible API\n8. [`AzureOpenAIModel`]: Uses Azure's OpenAI service\n9. [`AmazonBedrockModel`]: Connects to AWS Bedrock's API\n\nAll model classes support passing additional keyword arguments (like `temperature`, `max_tokens`, `top_p`, etc.) 
directly at instantiation time.\nThese parameters are automatically forwarded to the underlying model's completion calls, allowing you to configure model behavior such as creativity, response length, and sampling strategies.\n\n## Using Google Gemini Models\n\nAs explained in the Google Gemini API documentation (https://ai.google.dev/gemini-api/docs/openai),\nGoogle provides an OpenAI-compatible API for Gemini models, allowing you to use the [`OpenAIModel`]\nwith Gemini models by setting the appropriate base URL.\n\nFirst, install the required dependencies:\n```bash\npip install 'smolagents[openai]'\n```\n\nThen, [get a Gemini API key](https://ai.google.dev/gemini-api/docs/api-key) and set it in your code:\n```python\nGEMINI_API_KEY = <YOUR-GEMINI-API-KEY>\n```\n\nNow, you can initialize the Gemini model using the `OpenAIModel` class\nand setting the `api_base` parameter to the Gemini API base URL:\n```python\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"gemini-2.0-flash\",\n    # Google Gemini OpenAI-compatible API base URL\n    api_base=\"https://generativelanguage.googleapis.com/v1beta/openai/\",\n    api_key=GEMINI_API_KEY,\n)\n```\n\n## Using OpenRouter Models\n\nOpenRouter provides access to a wide variety of language models through a unified OpenAI-compatible API.\nYou can use the [`OpenAIModel`] to connect to OpenRouter by setting the appropriate base URL.\n\nFirst, install the required dependencies:\n```bash\npip install 'smolagents[openai]'\n```\n\nThen, [get an OpenRouter API key](https://openrouter.ai/keys) and set it in your code:\n```python\nOPENROUTER_API_KEY = <YOUR-OPENROUTER-API-KEY>\n```\n\nNow, you can initialize any model available on OpenRouter using the `OpenAIModel` class:\n```python\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    # You can use any model ID available on OpenRouter\n    model_id=\"openai/gpt-4o\",\n    # OpenRouter API base URL\n    api_base=\"https://openrouter.ai/api/v1\",\n    
api_key=OPENROUTER_API_KEY,\n)\n```\n\n## Using xAI's Grok Models\n\nxAI's Grok models can be accessed through [`LiteLLMModel`].\n\nSome models (such as \"grok-4\" and \"grok-3-mini\") don't support the `stop` parameter, so you'll need to use\n`REMOVE_PARAMETER` to exclude it from API calls.\n\nFirst, install the required dependencies:\n```bash\npip install 'smolagents[litellm]'\n```\n\nThen, [get an xAI API key](https://console.x.ai/) and set it in your code:\n```python\nXAI_API_KEY = <YOUR-XAI-API-KEY>\n```\n\nNow, you can initialize Grok models using the `LiteLLMModel` class and remove the `stop` parameter if applicable:\n```python\nfrom smolagents import LiteLLMModel, REMOVE_PARAMETER\n\n# Using Grok-4\nmodel = LiteLLMModel(\n    model_id=\"xai/grok-4\",\n    api_key=XAI_API_KEY,\n    stop=REMOVE_PARAMETER,  # Remove stop parameter as grok-4 model doesn't support it\n    temperature=0.7\n)\n\n# Or using Grok-3-mini\nmodel_mini = LiteLLMModel(\n    model_id=\"xai/grok-3-mini\",\n    api_key=XAI_API_KEY,\n    stop=REMOVE_PARAMETER,  # Remove stop parameter as grok-3-mini model doesn't support it\n    max_tokens=1000\n)\n```\n"
  },
  {
    "path": "docs/source/en/examples/web_browser.md",
    "content": "# Web Browser Automation with Agents 🤖🌐\n\n[[open-in-colab]]\n\nIn this notebook, we'll create an **agent-powered web browser automation system**! This system can navigate websites, interact with elements, and extract information automatically.\n\nThe agent will be able to:\n\n- [x] Navigate to web pages\n- [x] Click on elements\n- [x] Search within pages\n- [x] Handle popups and modals\n- [x] Extract information\n\nLet's set up this system step by step!\n\nFirst, run these lines to install the required dependencies:\n\n```bash\npip install smolagents selenium helium pillow -q\n```\n\nLet's import our required libraries and set up environment variables:\n\n```python\nfrom io import BytesIO\nfrom time import sleep\n\nimport helium\nfrom dotenv import load_dotenv\nfrom PIL import Image\nfrom selenium import webdriver\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\n\nfrom smolagents import CodeAgent, tool\nfrom smolagents.agents import ActionStep\n\n# Load environment variables\nload_dotenv()\n```\n\nNow let's create our core browser interaction tools that will allow our agent to navigate and interact with web pages:\n\n```python\n@tool\ndef search_item_ctrl_f(text: str, nth_result: int = 1) -> str:\n    \"\"\"\n    Searches for text on the current page via Ctrl + F and jumps to the nth occurrence.\n    Args:\n        text: The text to search for\n        nth_result: Which occurrence to jump to (default: 1)\n    \"\"\"\n    elements = driver.find_elements(By.XPATH, f\"//*[contains(text(), '{text}')]\")\n    if nth_result > len(elements):\n        raise Exception(f\"Match n°{nth_result} not found (only {len(elements)} matches found)\")\n    result = f\"Found {len(elements)} matches for '{text}'.\"\n    elem = elements[nth_result - 1]\n    driver.execute_script(\"arguments[0].scrollIntoView(true);\", elem)\n    result += f\"Focused on element {nth_result} of {len(elements)}\"\n    return result\n\n@tool\ndef 
go_back() -> None:\n    \"\"\"Goes back to previous page.\"\"\"\n    driver.back()\n\n@tool\ndef close_popups() -> None:\n    \"\"\"\n    Closes any visible modal or pop-up on the page. Use this to dismiss pop-up windows!\n    This does not work on cookie consent banners.\n    \"\"\"\n    webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()\n```\n\nLet's set up our browser with Chrome and configure screenshot capabilities:\n\n```python\n# Configure Chrome options\nchrome_options = webdriver.ChromeOptions()\nchrome_options.add_argument(\"--force-device-scale-factor=1\")\nchrome_options.add_argument(\"--window-size=1000,1350\")\nchrome_options.add_argument(\"--disable-pdf-viewer\")\nchrome_options.add_argument(\"--window-position=0,0\")\n\n# Initialize the browser\ndriver = helium.start_chrome(headless=False, options=chrome_options)\n\n# Set up screenshot callback\ndef save_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # Let JavaScript animations happen before taking the screenshot\n    driver = helium.get_driver()\n    current_step = memory_step.step_number\n    if driver is not None:\n        for previous_memory_step in agent.memory.steps:  # Remove previous screenshots for lean processing\n            if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= current_step - 2:\n                previous_memory_step.observations_images = None\n        png_bytes = driver.get_screenshot_as_png()\n        image = Image.open(BytesIO(png_bytes))\n        print(f\"Captured a browser screenshot: {image.size} pixels\")\n        memory_step.observations_images = [image.copy()]  # Create a copy to ensure it persists\n\n        # Update observations with current URL (only possible when a driver is available)\n        url_info = f\"Current url: {driver.current_url}\"\n        memory_step.observations = (\n            url_info if memory_step.observations is None else memory_step.observations + \"\\n\" + url_info\n        )\n```\n\nNow let's create our web automation 
agent:\n\n```python\nfrom smolagents import InferenceClientModel\n\n# Initialize the model\nmodel_id = \"Qwen/Qwen2-VL-72B-Instruct\"  # You can change this to your preferred VLM\nmodel = InferenceClientModel(model_id=model_id)\n\n# Create the agent\nagent = CodeAgent(\n    tools=[go_back, close_popups, search_item_ctrl_f],\n    model=model,\n    additional_authorized_imports=[\"helium\"],\n    step_callbacks=[save_screenshot],\n    max_steps=20,\n    verbosity_level=2,\n)\n\n# Import helium for the agent\nagent.python_executor(\"from helium import *\", agent.state)\n```\n\nThe agent needs instructions on how to use Helium for web automation. Here are the instructions we'll provide:\n\n```python\nhelium_instructions = \"\"\"\nYou can use helium to access websites. Don't worry about the helium driver, it's already managed.\nWe've already run \"from helium import *\"\nThen you can go to pages!\nCode:\n```py\ngo_to('github.com/trending')\n```<end_code>\n\nYou can directly click clickable elements by inputting the text that appears on them.\nCode:\n```py\nclick(\"Top products\")\n```<end_code>\n\nIf it's a link:\nCode:\n```py\nclick(Link(\"Top products\"))\n```<end_code>\n\nIf you try to interact with an element and it's not found, you'll get a LookupError.\nIn general, stop your action after each button click to see what happens on your screenshot.\nNever try to log in to a page.\n\nTo scroll up or down, use scroll_down or scroll_up, passing the number of pixels to scroll as the argument.\nCode:\n```py\nscroll_down(num_pixels=1200) # This will scroll one viewport down\n```<end_code>\n\nWhen you have pop-ups with a cross icon to close, don't try to click the close icon by finding its element or targeting an 'X' element (this most often fails).\nJust use your built-in tool `close_popups` to close them:\nCode:\n```py\nclose_popups()\n```<end_code>\n\nYou can use .exists() to check for the existence of an element. 
For example:\nCode:\n```py\nif Text('Accept cookies?').exists():\n    click('I accept')\n```<end_code>\n\"\"\"\n```\n\nNow we can run our agent with a task! Let's try finding information on Wikipedia:\n\n```python\nsearch_request = \"\"\"\nPlease navigate to https://en.wikipedia.org/wiki/Chicago and give me a sentence containing the word \"1992\" that mentions a construction accident.\n\"\"\"\n\nagent_output = agent.run(search_request + helium_instructions)\nprint(\"Final output:\")\nprint(agent_output)\n```\n\nYou can run different tasks by modifying the request. For example, here's a request to find out whether I should work harder:\n\n```python\ngithub_request = \"\"\"\nI'm trying to find how hard I have to work to get a repo in github.com/trending.\nCan you navigate to the profile for the top author of the top trending repo, and give me their total number of commits over the last year?\n\"\"\"\n\nagent_output = agent.run(github_request + helium_instructions)\nprint(\"Final output:\")\nprint(agent_output)\n```\n\nThe system is particularly effective for tasks like:\n- Data extraction from websites\n- Web research automation\n- UI testing and verification\n- Content monitoring"
  },
  {
    "path": "docs/source/en/guided_tour.md",
    "content": "# Agents - Guided tour\n\n[[open-in-colab]]\n\nIn this guided visit, you will learn how to build an agent, how to run it, and how to customize it to make it work better for your use-case.\n\n## Choosing an agent type: CodeAgent or ToolCallingAgent\n\n`smolagents` comes with two agent classes: [`CodeAgent`] and [`ToolCallingAgent`], which represent two different paradigms for how agents interact with tools.\nThe key difference lies in how actions are specified and executed: code generation vs structured tool calling.\n\n- [`CodeAgent`] generates tool calls as Python code snippets.\n  - The code is executed either locally (potentially unsecure) or in a secure sandbox.\n  - Tools are exposed as Python functions (via bindings).\n  - Example of tool call:\n    ```py\n    result = search_docs(\"What is the capital of France?\")\n    print(result)\n    ```\n  - Strengths:\n    - Highly expressive: Allows for complex logic and control flow and can combine tools, loop, transform, reason.\n    - Flexible: No need to predefine every possible action, can dynamically generate new actions/tools.\n    - Emergent reasoning: Ideal for multi-step problems or dynamic logic.\n  - Limitations\n    - Risk of errors: Must handle syntax errors, exceptions.\n    - Less predictable: More prone to unexpected or unsafe outputs.\n    - Requires secure execution environment.\n\n- [`ToolCallingAgent`] writes tool calls as structured JSON.\n  - This is the common format used in many frameworks (OpenAI API), allowing for structured tool interactions without code execution.\n  - Tools are defined with a JSON schema: name, description, parameter types, etc.\n  - Example of tool call:\n    ```json\n    {\n      \"tool_call\": {\n        \"name\": \"search_docs\",\n        \"arguments\": {\n          \"query\": \"What is the capital of France?\"\n        }\n      }\n    }\n    ```\n  - Strengths:\n    - Reliable: Less prone to hallucination, outputs are structured and validated.\n    - 
Safe: Arguments are strictly validated, no risk of arbitrary code running.\n    - Interoperable: Easy to map to external APIs or services.\n  - Limitations:\n    - Low expressivity: Can't easily combine or transform results dynamically, or perform complex logic or control flow.\n    - Inflexible: Must define all possible actions in advance, limited to predefined tools.\n    - No code synthesis: Limited to tool capabilities.\n\nWhen to use which agent type:\n- Use [`CodeAgent`] when:\n  - You need reasoning, chaining, or dynamic composition.\n  - Tools are functions that can be combined (e.g., parsing + math + querying).\n  - Your agent is a problem solver or programmer.\n\n- Use [`ToolCallingAgent`] when:\n  - You have simple, atomic tools (e.g., call an API, fetch a document).\n  - You want high reliability and clear validation.\n  - Your agent is like a dispatcher or controller.\n\n## CodeAgent\n\n[`CodeAgent`] generates Python code snippets to perform actions and solve tasks.\n\nBy default, the Python code execution is done in your local environment.\nThis should be safe because the only functions that can be called are the tools you provided (especially if it's only tools by Hugging Face) and a set of predefined safe functions like `print` or functions from the `math` module, so you're already limited in what can be executed.\n\nThe Python interpreter also doesn't allow imports by default outside of a safe list, so all the most obvious attacks shouldn't be an issue.\nYou can authorize additional imports by passing the authorized modules as a list of strings in the `additional_authorized_imports` argument upon initialization of your [`CodeAgent`]:\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel = InferenceClientModel()\nagent = CodeAgent(tools=[], model=model, additional_authorized_imports=['requests', 'bs4'])\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\nAdditionally, as an extra security layer, access to submodules is forbidden by default, 
unless explicitly authorized within the import list.\nFor instance, to access the `numpy.random` submodule, you need to add `'numpy.random'` to the `additional_authorized_imports` list.\nThis could also be authorized by using `numpy.*`, which will allow `numpy` as well as any subpackage like `numpy.random` and its own subpackages.\n\n> [!WARNING]\n> The LLM can generate arbitrary code that will then be executed: do not add any unsafe imports!\n\nExecution stops as soon as the generated code attempts an illegal operation or raises a regular Python error.\n\nYou can also use [Blaxel](https://blaxel.ai), [E2B](https://e2b.dev/docs#what-is-e2-b), or Docker instead of a local Python interpreter. For Blaxel, first [set the `BL_API_KEY` and `BL_WORKSPACE` environment variables](https://app.blaxel.ai/profile/security) and then pass `executor_type=\"blaxel\"` upon agent initialization. For E2B, first [set the `E2B_API_KEY` environment variable](https://e2b.dev/dashboard?tab=keys) and then pass `executor_type=\"e2b\"`. For Docker, pass `executor_type=\"docker\"`.\n\n\n> [!TIP]\n> Learn more about code execution [in this tutorial](tutorials/secure_code_execution).\n\n## ToolCallingAgent\n\n[`ToolCallingAgent`] outputs JSON tool calls, which is the common format used in many frameworks (OpenAI API), allowing for structured tool interactions without code execution. The example below uses the built-in `WebSearchTool` (from the smolagents toolkit extra, described in more detail later) so that the agent can perform web searches.
\n\nIt works in much the same way as [`CodeAgent`], but without `additional_authorized_imports`, since it doesn't execute code:\n\n```py\nfrom smolagents import ToolCallingAgent, WebSearchTool\n\nagent = ToolCallingAgent(tools=[WebSearchTool()], model=model)\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n## Using the CLI\n\nYou can quickly get started with smolagents using the command line interface:\n\n```bash\n# Run with direct prompt and options\nsmolagent \"Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7.\"  --model-type \"InferenceClientModel\" --model-id \"Qwen/Qwen2.5-Coder-32B-Instruct\" --imports \"pandas numpy\" --tools \"web_search\"\n\n# Run in interactive mode: launches when no prompt is provided, will guide you through argument selection\nsmolagent\n```\n\n## Building your agent\n\nTo initialize a minimal agent, you need at least these two arguments:\n\n- `model`, a text-generation model to power your agent: an agent is not a plain LLM but a system that uses an LLM as its engine. 
You can use any of these options:\n    - [`TransformersModel`] takes a pre-initialized `transformers` pipeline to run inference on your local machine using `transformers`.\n    - [`InferenceClientModel`] leverages a `huggingface_hub.InferenceClient` under the hood and supports all Inference Providers on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.\n    - [`LiteLLMModel`] similarly lets you call 100+ different models and providers through [LiteLLM](https://docs.litellm.ai/)!\n    - [`AzureOpenAIModel`] allows you to use OpenAI models deployed in [Azure](https://azure.microsoft.com/en-us/products/ai-services/openai-service).\n    - [`AmazonBedrockModel`] allows you to use Amazon Bedrock in [AWS](https://aws.amazon.com/bedrock/?nc1=h_ls).\n    - [`MLXModel`] creates a [mlx-lm](https://pypi.org/project/mlx-lm/) pipeline to run inference on your local machine.\n\n- `tools`, a list of `Tools` that the agent can use to solve the task. It can be an empty list. You can also add the default toolbox on top of your `tools` list by defining the optional argument `add_base_tools=True`.\n\nOnce you have these two arguments, `tools` and `model`,  you can create an agent and run it. You can use any LLM you'd like, either through [Inference Providers](https://huggingface.co/blog/inference-providers), [transformers](https://github.com/huggingface/transformers/), [ollama](https://ollama.com/), [LiteLLM](https://www.litellm.ai/), [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service), [Amazon Bedrock](https://aws.amazon.com/bedrock/?nc1=h_ls), or [mlx-lm](https://pypi.org/project/mlx-lm/).\n\nAll model classes support passing additional keyword arguments (like `temperature`, `max_tokens`, `top_p`, etc.) 
directly at instantiation time.\nThese parameters are automatically forwarded to the underlying model's completion calls, allowing you to configure model behavior such as creativity, response length, and sampling strategies.\n\n<hfoptions id=\"Pick a LLM\">\n<hfoption id=\"Inference Providers\">\n\nInference Providers need an `HF_TOKEN` to authenticate, but a free HF account already comes with included credits. Upgrade to PRO to raise your included credits.\n\nTo access gated models or raise your rate limits with a PRO account, you need to set the environment variable `HF_TOKEN` or pass the `token` argument upon initialization of `InferenceClientModel`. You can get your token from your [settings page](https://huggingface.co/settings/tokens).\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nmodel = InferenceClientModel(model_id=model_id, token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\") # You can choose to not pass any model_id to InferenceClientModel to use a default model\n# you can also specify a particular provider e.g. 
provider=\"together\" or provider=\"sambanova\"\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Local Transformers Model\">\n\n```python\n# !pip install 'smolagents[transformers]'\nfrom smolagents import CodeAgent, TransformersModel\n\nmodel_id = \"meta-llama/Llama-3.2-3B-Instruct\"\n\nmodel = TransformersModel(model_id=model_id)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"OpenAI or Anthropic API\">\n\nTo use `LiteLLMModel`, you need to set the environment variable `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`, or pass `api_key` variable upon initialization.\n\n```python\n# !pip install 'smolagents[litellm]'\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", api_key=\"YOUR_ANTHROPIC_API_KEY\") # Could use 'gpt-4o'\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Ollama\">\n\n```python\n# !pip install 'smolagents[litellm]'\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(\n    model_id=\"ollama_chat/llama3.2\", # This model is a bit weak for agentic behaviours though\n    api_base=\"http://localhost:11434\", # replace with 127.0.0.1:11434 or remote open-ai compatible server if necessary\n    api_key=\"YOUR_API_KEY\", # replace with API key if necessary\n    num_ctx=8192, # ollama default is 2048 which will fail horribly. 8192 works for easy tasks, more is better. 
Check https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator to calculate how much VRAM this will need for the selected model.\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Azure OpenAI\">\n\nTo connect to Azure OpenAI, you can either use `AzureOpenAIModel` directly, or use `LiteLLMModel` and configure it accordingly.\n\nTo initialize an instance of `AzureOpenAIModel`, you need to pass your model deployment name and then either pass the `azure_endpoint`, `api_key`, and `api_version` arguments, or set the environment variables `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.\n\n```python\n# !pip install 'smolagents[openai]'\nfrom smolagents import CodeAgent, AzureOpenAIModel\n\nmodel = AzureOpenAIModel(model_id=\"gpt-4o-mini\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\nSimilarly, you can configure `LiteLLMModel` to connect to Azure OpenAI as follows:\n\n- pass your model deployment name as `model_id`, and make sure to prefix it with `azure/`\n- make sure to set the environment variable `AZURE_API_VERSION`\n- either pass the `api_base` and `api_key` arguments, or set the environment variables `AZURE_API_KEY`, and `AZURE_API_BASE`\n\n```python\nimport os\nfrom smolagents import CodeAgent, LiteLLMModel\n\nAZURE_OPENAI_CHAT_DEPLOYMENT_NAME=\"gpt-35-turbo-16k-deployment\" # example of deployment name\n\nos.environ[\"AZURE_API_KEY\"] = \"\" # api_key\nos.environ[\"AZURE_API_BASE\"] = \"\" # \"https://example-endpoint.openai.azure.com\"\nos.environ[\"AZURE_API_VERSION\"] = \"\" # \"2024-10-01-preview\"\n\nmodel = LiteLLMModel(model_id=\"azure/\" + AZURE_OPENAI_CHAT_DEPLOYMENT_NAME)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n   \"Could you give 
me the 118th number in the Fibonacci sequence?\",\n)\n```\n\n</hfoption>\n<hfoption id=\"Amazon Bedrock\">\n\nThe `AmazonBedrockModel` class provides native integration with Amazon Bedrock, allowing for direct API calls and comprehensive configuration.\n\nBasic Usage:\n\n```python\n# !pip install 'smolagents[bedrock]'\nfrom smolagents import CodeAgent, AmazonBedrockModel\n\nmodel = AmazonBedrockModel(model_id=\"anthropic.claude-3-sonnet-20240229-v1:0\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\nAdvanced Configuration:\n\n```python\nimport boto3\nfrom smolagents import AmazonBedrockModel, CodeAgent\n\n# Create a custom Bedrock client\nbedrock_client = boto3.client(\n    'bedrock-runtime',\n    region_name='us-east-1',\n    aws_access_key_id='YOUR_ACCESS_KEY',\n    aws_secret_access_key='YOUR_SECRET_KEY'\n)\n\nadditional_api_config = {\n    \"inferenceConfig\": {\n        \"maxTokens\": 3000\n    },\n    \"guardrailConfig\": {\n        \"guardrailIdentifier\": \"identify1\",\n        \"guardrailVersion\": 'v1'\n    },\n}\n\n# Initialize with comprehensive configuration\nmodel = AmazonBedrockModel(\n    model_id=\"us.amazon.nova-pro-v1:0\",\n    client=bedrock_client,  # Use custom client\n    **additional_api_config\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\nUsing LiteLLMModel:\n\nAlternatively, you can use `LiteLLMModel` with Bedrock models:\n\n```python\nfrom smolagents import LiteLLMModel, CodeAgent\n\nmodel = LiteLLMModel(model_id=\"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\")\nagent = CodeAgent(tools=[], model=model)\n\nagent.run(\"Explain the concept of quantum computing\")\n```\n\n</hfoption>\n<hfoption id=\"mlx-lm\">\n\n```python\n# !pip install 'smolagents[mlx-lm]'\nfrom smolagents import CodeAgent, MLXModel\n\nmlx_model = 
MLXModel(\"mlx-community/Qwen2.5-Coder-32B-Instruct-4bit\")\nagent = CodeAgent(model=mlx_model, tools=[], add_base_tools=True)\n\nagent.run(\"Could you give me the 118th number in the Fibonacci sequence?\")\n```\n\n</hfoption>\n</hfoptions>\n\n### Model parameter management\n\nWhen initializing models, you can pass keyword arguments that will be forwarded as completion parameters to the\nunderlying model API during inference.\n\nFor fine-grained control over parameter handling, the `REMOVE_PARAMETER` sentinel value allows you to explicitly exclude\nparameters that might otherwise be set by default or passed through elsewhere:\n\n```python\nfrom smolagents import CodeAgent, OpenAIModel, REMOVE_PARAMETER\n\n# Remove \"stop\" parameter\nmodel = OpenAIModel(\n    model_id=\"gpt-5\",\n    stop=REMOVE_PARAMETER,  # Ensures \"stop\" is not included in API calls\n    temperature=0.7\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n```\n\nThis is particularly useful when:\n- You want to override default parameters that might be applied automatically\n- You need to ensure certain parameters are completely excluded from API calls\n- You want to let the model provider use its own defaults for specific parameters\n\n## Advanced agent configuration\n\n### Customizing agent termination conditions\n\nBy default, an agent continues running until it calls the `final_answer` function or reaches the maximum number of steps.\nThe `final_answer_checks` parameter gives you more control over when and how an agent terminates its execution:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# Define a custom final answer check function\ndef is_integer(final_answer: str, agent_memory=None) -> bool:\n    \"\"\"Return True if final_answer is an integer.\"\"\"\n    try:\n        int(final_answer)\n        return True\n    except (TypeError, ValueError):  # Not convertible to an integer\n        return False\n\n# Initialize agent with custom final answer check\nagent = CodeAgent(\n    tools=[],\n    
model=InferenceClientModel(),\n    final_answer_checks=[is_integer]\n)\n\nagent.run(\"Calculate the least common multiple of 3 and 7\")\n```\n\nThe `final_answer_checks` parameter accepts a list of functions that each:\n- Take the agent's final_answer and the agent itself as parameters\n- Return a boolean indicating whether the final_answer is valid (`True`) or not (`False`)\n\nIf any function returns `False`, the agent will log the error message and continue the run.\nThis validation mechanism enables:\n- Enforcing output format requirements (e.g., ensuring numeric answers for math problems)\n- Implementing domain-specific validation rules\n- Creating more robust agents that validate their own outputs\n\n## Inspecting an agent run\n\nHere are a few useful ways to inspect what happened after a run:\n- `agent.logs` stores the fine-grained logs of the agent. At every step of the agent's run, everything gets stored in a dictionary that is then appended to `agent.logs`.\n- Running `agent.write_memory_to_messages()` writes the agent's memory as a list of chat messages for the model to view. This method goes over each step of the log and only stores what it's interested in as a message: for instance, it will save the system prompt and task in separate messages, then for each step it will store the LLM output as a message, and the tool call output as another message. Use this if you want a higher-level view of what has happened, but keep in mind that not every log is transcribed by this method.\n\n## Tools\n\nA tool is an atomic function to be used by an agent. 
To be used by an LLM, it also needs a few attributes that constitute its API and will be used to describe to the LLM how to call this tool:\n- A name\n- A description\n- Input types and descriptions\n- An output type\n\nFor instance, you can check the [`PythonInterpreterTool`]: it has a name, a description, input descriptions, an output type, and a `forward` method to perform the action.\n\nWhen the agent is initialized, the tool attributes are used to generate a tool description which is baked into the agent's system prompt. This lets the agent know which tools it can use and why.\n\n**Schema Information**: For tools that have an `output_schema` defined (such as MCP tools with structured output), the `CodeAgent` system prompt automatically includes the JSON schema information. This helps the agent understand the expected structure of tool outputs and access the data appropriately.\n\n### Default toolbox\n\nIf you install `smolagents` with the \"toolkit\" extra, it comes with a default toolbox for empowering agents, which you can add to your agent upon initialization with the argument `add_base_tools=True`:\n\n- **DuckDuckGo web search**: performs a web search using the DuckDuckGo search engine.\n- **Python code interpreter**: runs your LLM-generated Python code in a secure environment. 
This tool will only be added to [`ToolCallingAgent`] if you initialize it with `add_base_tools=True`, since code-based agents can already natively execute Python code.\n- **Transcriber**: a speech-to-text pipeline built on Whisper-Turbo that transcribes audio to text.\n\nYou can manually use a tool by calling it with its arguments.\n\n```python\n# !pip install 'smolagents[toolkit]'\nfrom smolagents import WebSearchTool\n\nsearch_tool = WebSearchTool()\nprint(search_tool(\"Who's the current president of Russia?\"))\n```\n\n### Create a new tool\n\nYou can create your own tool for use cases not covered by the default tools from Hugging Face.\nFor example, let's create a tool that returns the most downloaded model for a given task from the Hub.\n\nYou'll start with the code below.\n\n```python\nfrom huggingface_hub import list_models\n\ntask = \"text-classification\"\n\nmost_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\nprint(most_downloaded_model.id)\n```\n\nThis code can quickly be converted into a tool, just by wrapping it in a function and adding the `tool` decorator.\nThis is not the only way to build the tool: you can also define it directly as a subclass of [`Tool`], which gives you more flexibility, for instance the ability to initialize heavy class attributes.\n\nLet's see how it works for both options:\n\n<hfoptions id=\"build-a-tool\">\n<hfoption id=\"Decorate a function with @tool\">\n\n```py\nfrom huggingface_hub import list_models\nfrom smolagents import tool\n\n@tool\ndef model_download_tool(task: str) -> str:\n    \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\n\n    Args:\n        task: The task for which to get the download count.\n    \"\"\"\n    most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n    return most_downloaded_model.id\n```\n\nThe function needs:\n- A clear name. 
The name should describe what this tool does clearly enough to help the LLM powering the agent. Since this tool returns the model with the most downloads for a task, let's name it `model_download_tool`.\n- Type hints on both inputs and output\n- A description that includes an 'Args:' part where each argument is described (without a type indication this time, since it is pulled from the type hint). Same as for the tool name, this description is an instruction manual for the LLM powering your agent, so do not neglect it.\n\nAll these elements will be automatically baked into the agent's system prompt upon initialization, so strive to make them as clear as possible!\n\n> [!TIP]\n> This definition format is the same as the tool schemas used in `apply_chat_template`; the only difference is the added `tool` decorator. Read more on our tool use API [here](https://huggingface.co/blog/unified-tool-use#passing-tools-to-a-chat-template).\n\n\nThen you can directly initialize your agent:\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[model_download_tool], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n<hfoption id=\"Subclass Tool\">\n\n```py\nfrom huggingface_hub import list_models\nfrom smolagents import Tool\n\nclass ModelDownloadTool(Tool):\n    name = \"model_download_tool\"\n    description = \"This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. 
It returns the name of the checkpoint.\"\n    inputs = {\"task\": {\"type\": \"string\", \"description\": \"The task for which to get the download count.\"}}\n    output_type = \"string\"\n\n    def forward(self, task: str) -> str:\n        most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return most_downloaded_model.id\n```\n\nThe subclass needs the following attributes:\n- A clear `name`. The name should be descriptive enough of what this tool does to help the LLM brain powering the agent. Since this tool returns the model with the most downloads for a task, let's name it `model_download_tool`.\n- A `description`. Same as for the `name`, this description is an instruction manual for the LLM powering your agent, so do not neglect it.\n- Input types and descriptions\n- Output type\nAll these attributes will be automatically baked into the agent's system prompt upon initialization: so strive to make them as clear as possible!\n\n\nThen you can directly initialize your agent:\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[ModelDownloadTool()], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n</hfoptions>\n\nYou get the following logs:\n```text\n╭──────────────────────────────────────── New run ─────────────────────────────────────────╮\n│                                                                                          │\n│ Can you give me the name of the model that has the most downloads in the 'text-to-video' │\n│ task on the Hugging Face Hub?                                                            
│\n│                                                                                          │\n╰─ InferenceClientModel - Qwen/Qwen2.5-Coder-32B-Instruct ───────────────────────────────────────────╯\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 model_name = model_download_tool(task=\"text-to-video\")                               │\n│   2 print(model_name)                                                                    │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nExecution logs:\nByteDance/AnimateDiff-Lightning\n\nOut: None\n[Step 0: Duration 0.27 seconds| Input tokens: 2,069 | Output tokens: 60]\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 final_answer(\"ByteDance/AnimateDiff-Lightning\")                                      │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nOut - Final answer: ByteDance/AnimateDiff-Lightning\n[Step 1: Duration 0.10 seconds| Input tokens: 4,288 | Output tokens: 148]\nOut[20]: 'ByteDance/AnimateDiff-Lightning'\n```\n\n> [!TIP]\n> Read more on tools in the [dedicated tutorial](./tutorials/tools#what-is-a-tool-and-how-to-build-one).\n\n## Multi-agents\n\nMulti-agent systems have been introduced with Microsoft's framework [Autogen](https://huggingface.co/papers/2308.08155).\n\nIn this type of framework, you have several agents working together to solve your task instead of only one.\nIt empirically yields better performance on most benchmarks. The reason for this better performance is conceptually simple: for many tasks, rather than using a do-it-all system, you would prefer to specialize units on sub-tasks. 
Here, having agents with separate tool sets and memories makes efficient specialization possible. For instance, why fill the memory of the code-generating agent with all the content of webpages visited by the web search agent? It's better to keep them separate.\n\nYou can easily build hierarchical multi-agent systems with `smolagents`.\n\nTo do so, just ensure your agent has `name` and `description` attributes, which will then be embedded in the manager agent's system prompt to let it know how to call this managed agent, as we also do for tools.\nThen you can pass this managed agent in the `managed_agents` parameter upon initialization of the manager agent.\n\nHere's an example of making an agent that manages a specific web search agent using our native [`WebSearchTool`]:\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel, WebSearchTool\n\nmodel = InferenceClientModel()\n\nweb_agent = CodeAgent(\n    tools=[WebSearchTool()],\n    model=model,\n    name=\"web_search_agent\",\n    description=\"Runs web searches for you. 
Give it your query as an argument.\"\n)\n\nmanager_agent = CodeAgent(\n    tools=[], model=model, managed_agents=[web_agent]\n)\n\nmanager_agent.run(\"Who is the CEO of Hugging Face?\")\n```\n\n> [!TIP]\n> For an in-depth example of an efficient multi-agent implementation, see [how we pushed our multi-agent system to the top of the GAIA leaderboard](https://huggingface.co/blog/beating-gaia).\n\n\n## Talk with your agent and visualize its thoughts in a cool Gradio interface\n\nYou can use `GradioUI` to interactively submit tasks to your agent and observe its thought and execution process. Here is an example:\n\n```py\nfrom smolagents import (\n    load_tool,\n    CodeAgent,\n    InferenceClientModel,\n    GradioUI\n)\n\n# Import tool from Hub\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nmodel = InferenceClientModel()  # Uses a default model\n\n# Initialize the agent with the image generation tool\nagent = CodeAgent(tools=[image_generation_tool], model=model)\n\nGradioUI(agent).launch()\n```\n\nUnder the hood, when the user submits a new message, the agent is launched with `agent.run(user_request, reset=False)`.\nThe `reset=False` flag means the agent's memory is not flushed before launching this new task, which lets the conversation go on.\n\nYou can also use this `reset=False` argument to keep the conversation going in any other agentic application.\n\nIn Gradio UIs, you can let users interrupt a running agent with a button that triggers the `agent.interrupt()` method.\nThis will stop the agent at the end of its current step, then raise an error.\n\n## Next steps\n\nFinally, when you've configured your agent to your needs, you can share it to the Hub!\n\n```py\nagent.push_to_hub(\"m-ric/my_agent\")\n```\n\nSimilarly, to load an agent that has been pushed to the Hub, if you trust the code from its tools, use:\n```py\nagent = agent.from_hub(\"m-ric/my_agent\", trust_remote_code=True)\n```\n\nFor more in-depth usage, you 
will then want to check out our tutorials:\n- [the explanation of how our code agents work](./tutorials/secure_code_execution)\n- [this guide on how to build good agents](./tutorials/building_good_agents)\n- [the in-depth guide for tool usage](./tutorials/tools)\n"
  },
  {
    "path": "docs/source/en/index.md",
    "content": "# `smolagents`\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/license_to_call.png\" style=\"max-width:700px\"/>\n</div>\n\n## What is smolagents?\n\n`smolagents` is an open-source Python library designed to make it extremely easy to build and run agents using just a few lines of code.\n\nKey features of `smolagents` include:\n\n✨ **Simplicity**: The logic for agents fits in ~thousand lines of code. We kept abstractions to their minimal shape above raw code!\n\n🧑‍💻 **First-class support for Code Agents**: [`CodeAgent`](reference/agents#smolagents.CodeAgent) writes its actions in code (as opposed to \"agents being used to write code\") to invoke tools or perform computations, enabling natural composability (function nesting, loops, conditionals). To make it secure, we support [executing in sandboxed environment](tutorials/secure_code_execution) via [Modal](https://modal.com/), [Blaxel](https://blaxel.ai), [E2B](https://e2b.dev/), or Docker.\n\n📡 **Common Tool-Calling Agent Support**: In addition to CodeAgents, [`ToolCallingAgent`](reference/agents#smolagents.ToolCallingAgent) supports usual JSON/text-based tool-calling for scenarios where that paradigm is preferred.\n\n🤗 **Hub integrations**: Seamlessly share and load agents and tools to/from the Hub as Gradio Spaces.\n\n🌐 **Model-agnostic**: Easily integrate any large language model (LLM), whether it's hosted on the Hub via [Inference providers](https://huggingface.co/docs/inference-providers/index), accessed via APIs such as OpenAI, Anthropic, or many others via LiteLLM integration, or run locally using Transformers or Ollama. Powering an agent with your preferred LLM is straightforward and flexible.\n\n👁️ **Modality-agnostic**: Beyond text, agents can handle vision, video, and audio inputs, broadening the range of possible applications. 
Check out [this tutorial](examples/web_browser) for vision.\n\n🛠️ **Tool-agnostic**: You can use tools from any [MCP server](reference/tools#smolagents.ToolCollection.from_mcp), from [LangChain](reference/tools#smolagents.Tool.from_langchain), you can even use a [Hub Space](reference/tools#smolagents.Tool.from_space) as a tool.\n\n💻 **CLI Tools**: Comes with command-line utilities (smolagent, webagent) for quickly running agents without writing boilerplate code.\n\n## Quickstart\n\n[[open-in-colab]]\n\nGet started with smolagents in just a few minutes! This guide will show you how to create and run your first agent.\n\n### Installation\n\nInstall smolagents with pip:\n\n```bash\npip install 'smolagents[toolkit]'  # Includes default tools like web search\n```\n\n### Create Your First Agent\n\nHere's a minimal example to create and run an agent:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# Initialize a model (using Hugging Face Inference API)\nmodel = InferenceClientModel()  # Uses a default model\n\n# Create an agent with no tools\nagent = CodeAgent(tools=[], model=model)\n\n# Run the agent with a task\nresult = agent.run(\"Calculate the sum of numbers from 1 to 10\")\nprint(result)\n```\n\nThat's it! 
Your agent will use Python code to solve the task and return the result.\n\n### Adding Tools\n\nLet's make our agent more capable by adding some tools:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel, DuckDuckGoSearchTool\n\nmodel = InferenceClientModel()\nagent = CodeAgent(\n    tools=[DuckDuckGoSearchTool()],\n    model=model,\n)\n\n# Now the agent can search the web!\nresult = agent.run(\"What is the current weather in Paris?\")\nprint(result)\n```\n\n### Using Different Models\n\nYou can use various models with your agent:\n\n```python\n# Using a specific model from Hugging Face\nmodel = InferenceClientModel(model_id=\"meta-llama/Llama-2-70b-chat-hf\")\n\n# Using OpenAI/Anthropic (requires 'smolagents[litellm]')\nfrom smolagents import LiteLLMModel\nmodel = LiteLLMModel(model_id=\"gpt-4\")\n\n# Using local models (requires 'smolagents[transformers]')\nfrom smolagents import TransformersModel\nmodel = TransformersModel(model_id=\"meta-llama/Llama-2-7b-chat-hf\")\n```\n\n## Next Steps\n\n- Learn how to set up smolagents with various models and tools in the [Installation Guide](installation)\n- Check out the [Guided Tour](guided_tour) for more advanced features\n- Learn about [building custom tools](tutorials/tools)\n- Explore [secure code execution](tutorials/secure_code_execution)\n- See how to create [multi-agent systems](tutorials/building_good_agents)\n\n<div class=\"mt-10\">\n  <div class=\"w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5\">\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./guided_tour\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Guided tour</div>\n      <p class=\"text-gray-700\">Learn the basics and become familiar with using Agents. 
Start here if you are using Agents for the first time!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./examples/text_to_sql\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">How-to guides</div>\n      <p class=\"text-gray-700\">Practical guides to help you achieve a specific goal: create an agent to generate and test SQL queries!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./conceptual_guides/intro_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Conceptual guides</div>\n      <p class=\"text-gray-700\">High-level explanations for building a better understanding of important topics.</p>\n   </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./tutorials/building_good_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Tutorials</div>\n      <p class=\"text-gray-700\">Horizontal tutorials that cover important aspects of building agents.</p>\n    </a>\n  </div>\n</div>\n"
  },
  {
    "path": "docs/source/en/installation.md",
    "content": "# Installation Options\n\nThe `smolagents` library can be installed using pip. Here are the different installation methods and options available.\n\n## Prerequisites\n- Python 3.10 or newer\n- Python package manager: [`pip`](https://pip.pypa.io/en/stable/) or [`uv`](https://docs.astral.sh/uv/)\n\n## Virtual Environment\n\nIt's strongly recommended to install `smolagents` within a Python virtual environment.\nVirtual environments isolate your project dependencies from other Python projects and your system Python installation,\npreventing version conflicts and making package management more reliable.\n\n<hfoptions id=\"virtual-environment\">\n<hfoption id=\"venv\">\n\nUsing [`venv`](https://docs.python.org/3/library/venv.html):\n\n```bash\npython -m venv .venv\nsource .venv/bin/activate\n```\n\n</hfoption>\n<hfoption id=\"uv\">\n\nUsing [`uv`](https://docs.astral.sh/uv/):\n\n```bash\nuv venv .venv\nsource .venv/bin/activate\n```\n\n</hfoption>\n</hfoptions>\n\n## Basic Installation\n\nInstall `smolagents` core library with:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install smolagents\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install smolagents\n```\n</hfoption>\n</hfoptions>\n\n## Installation with Extras\n\n`smolagents` provides several optional dependencies (extras) that can be installed based on your needs.\nYou can install these extras using the following syntax:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install \"smolagents[extra1,extra2]\"\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install \"smolagents[extra1,extra2]\"\n```\n</hfoption>\n</hfoptions>\n\n### Tools\nThese extras include various tools and integrations:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **toolkit**: Install a default set of tools for common tasks.\n  ```bash\n  pip install \"smolagents[toolkit]\"\n  ```\n- **mcp**: Add support for the Model Context Protocol (MCP) to 
integrate with external tools and services.\n  ```bash\n  pip install \"smolagents[mcp]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **toolkit**: Install a default set of tools for common tasks.\n  ```bash\n  uv pip install \"smolagents[toolkit]\"\n  ```\n- **mcp**: Add support for the Model Context Protocol (MCP) to integrate with external tools and services.\n  ```bash\n  uv pip install \"smolagents[mcp]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Model Integration\nThese extras enable integration with various AI models and frameworks:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **openai**: Add support for OpenAI API models.\n  ```bash\n  pip install \"smolagents[openai]\"\n  ```\n- **transformers**: Enable Hugging Face Transformers models.\n  ```bash\n  pip install \"smolagents[transformers]\"\n  ```\n- **vllm**: Add VLLM support for efficient model inference.\n  ```bash\n  pip install \"smolagents[vllm]\"\n  ```\n- **mlx-lm**: Enable support for MLX-LM models.\n  ```bash\n  pip install \"smolagents[mlx-lm]\"\n  ```\n- **litellm**: Add LiteLLM support for lightweight model inference.\n  ```bash\n  pip install \"smolagents[litellm]\"\n  ```\n- **bedrock**: Enable support for AWS Bedrock models.\n  ```bash\n  pip install \"smolagents[bedrock]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **openai**: Add support for OpenAI API models.\n  ```bash\n  uv pip install \"smolagents[openai]\"\n  ```\n- **transformers**: Enable Hugging Face Transformers models.\n  ```bash\n  uv pip install \"smolagents[transformers]\"\n  ```\n- **vllm**: Add VLLM support for efficient model inference.\n  ```bash\n  uv pip install \"smolagents[vllm]\"\n  ```\n- **mlx-lm**: Enable support for MLX-LM models.\n  ```bash\n  uv pip install \"smolagents[mlx-lm]\"\n  ```\n- **litellm**: Add LiteLLM support for lightweight model inference.\n  ```bash\n  uv pip install \"smolagents[litellm]\"\n  ```\n- **bedrock**: Enable support for AWS Bedrock models.\n  ```bash\n  uv pip install 
\"smolagents[bedrock]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Multimodal Capabilities\nExtras for handling different types of media and input:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **vision**: Add support for image processing and computer vision tasks.\n  ```bash\n  pip install \"smolagents[vision]\"\n  ```\n- **audio**: Enable audio processing capabilities.\n  ```bash\n  pip install \"smolagents[audio]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **vision**: Add support for image processing and computer vision tasks.\n  ```bash\n  uv pip install \"smolagents[vision]\"\n  ```\n- **audio**: Enable audio processing capabilities.\n  ```bash\n  uv pip install \"smolagents[audio]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Remote Execution\nExtras for executing code remotely:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **blaxel**: Add support for Blaxel sandboxes - fast-launching VMs with hibernation (recommended).\n  ```bash\n  pip install \"smolagents[blaxel]\"\n  ```\n- **e2b**: Enable E2B support for remote execution.\n  ```bash\n  pip install \"smolagents[e2b]\"\n  ```\n- **docker**: Add support for executing code in Docker containers.\n  ```bash\n  pip install \"smolagents[docker]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **blaxel**: Add support for Blaxel sandboxes - fast-launching VMs with hibernation (recommended).\n  ```bash\n  uv pip install \"smolagents[blaxel]\"\n  ```\n- **e2b**: Enable E2B support for remote execution.\n  ```bash\n  uv pip install \"smolagents[e2b]\"\n  ```\n- **docker**: Add support for executing code in Docker containers.\n  ```bash\n  uv pip install \"smolagents[docker]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Telemetry and User Interface\nExtras for telemetry, monitoring and user interface components:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **telemetry**: Add support for monitoring and tracing.\n  ```bash\n  pip install \"smolagents[telemetry]\"\n  ```\n- **gradio**: 
Add support for interactive Gradio UI components.\n  ```bash\n  pip install \"smolagents[gradio]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **telemetry**: Add support for monitoring and tracing.\n  ```bash\n  uv pip install \"smolagents[telemetry]\"\n  ```\n- **gradio**: Add support for interactive Gradio UI components.\n  ```bash\n  uv pip install \"smolagents[gradio]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Complete Installation\nTo install all available extras, you can use:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install \"smolagents[all]\"\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install \"smolagents[all]\"\n```\n</hfoption>\n</hfoptions>\n\n## Verifying Installation\nAfter installation, you can verify that `smolagents` is installed correctly by running:\n```python\nimport smolagents\nprint(smolagents.__version__)\n```\n\n## Next Steps\nOnce you have successfully installed `smolagents`, you can:\n- Follow the [guided tour](./guided_tour) to learn the basics.\n- Explore the [how-to guides](./examples/text_to_sql) for practical examples.\n- Read the [conceptual guides](./conceptual_guides/intro_agents) for high-level explanations.\n- Check out the [tutorials](./tutorials/building_good_agents) for in-depth tutorials on building agents.\n- Explore the [API reference](./reference/index) for detailed information on classes and functions.\n"
  },
  {
    "path": "docs/source/en/reference/agents.md",
    "content": "# Agents\n\n<Tip warning={true}>\n\nSmolagents is an experimental API which is subject to change at any time. Results returned by the agents\ncan vary as the APIs or underlying models are prone to change.\n\n</Tip>\n\nTo learn more about agents and tools make sure to read the [introductory guide](../index). This page\ncontains the API docs for the underlying classes.\n\n## Agents\n\nOur agents inherit from [`MultiStepAgent`], which means they can act in multiple steps, each step consisting of one thought, then one tool call and execution. Read more in [this conceptual guide](../conceptual_guides/react).\n\nWe provide two types of agents, based on the main [`Agent`] class.\n  - [`CodeAgent`] writes its tool calls in Python code (this is the default).\n  - [`ToolCallingAgent`] writes its tool calls in JSON.\n\nBoth require arguments `model` and list of tools `tools` at initialization.\n\n### Classes of agents\n\n[[autodoc]] MultiStepAgent\n\n[[autodoc]] CodeAgent\n\n[[autodoc]] ToolCallingAgent\n\n### stream_to_gradio\n\n[[autodoc]] stream_to_gradio\n\n### GradioUI\n\n> [!TIP]\n> You must have `gradio` installed to use the UI. Please run `pip install 'smolagents[gradio]'` if it's not the case.\n\n[[autodoc]] GradioUI\n\n## Prompts\n\n[[autodoc]] smolagents.agents.PromptTemplates\n\n[[autodoc]] smolagents.agents.PlanningPromptTemplate\n\n[[autodoc]] smolagents.agents.ManagedAgentPromptTemplate\n\n[[autodoc]] smolagents.agents.FinalAnswerPromptTemplate\n\n## Memory\n\nSmolagents use memory to store information across multiple steps.\n\n[[autodoc]] smolagents.memory.AgentMemory\n"
  },
  {
    "path": "docs/source/en/reference/default_tools.md",
    "content": "# Built-in Tools\n\nReady-to-use tool implementations provided by the `smolagents` library.\n\nThese built-in tools are concrete implementations of the [`Tool`] base class, each designed for specific tasks such as web searching, Python code execution, webpage retrieval, and user interaction.\nYou can use these tools directly in your agents without having to implement the underlying functionality yourself.\nEach tool handles a particular capability and follows a consistent interface, making it easy to compose them into powerful agent workflows.\n\nThe built-in tools can be categorized by their primary functions:\n- **Information Retrieval**: Search and retrieve information from the web and specific knowledge sources.\n  - [`ApiWebSearchTool`]\n  - [`DuckDuckGoSearchTool`]\n  - [`GoogleSearchTool`]\n  - [`WebSearchTool`]\n  - [`WikipediaSearchTool`]\n- **Web Interaction**: Fetch and process content from specific web pages.\n  - [`VisitWebpageTool`]\n- **Code Execution**: Dynamic execution of Python code for computational tasks.\n  - [`PythonInterpreterTool`]\n- **User Interaction**: Enable Human-in-the-Loop collaboration between agents and users.\n  - [`UserInputTool`]: Collect input from users.\n- **Speech Processing**: Convert audio to textual data.\n  - [`SpeechToTextTool`]\n- **Workflow Control**: Manage and direct the flow of agent operations.\n  - [`FinalAnswerTool`]: Conclude agent workflow with final response.\n\n## ApiWebSearchTool\n\n[[autodoc]] smolagents.default_tools.ApiWebSearchTool\n\n## DuckDuckGoSearchTool\n\n[[autodoc]] smolagents.default_tools.DuckDuckGoSearchTool\n\n## FinalAnswerTool\n\n[[autodoc]] smolagents.default_tools.FinalAnswerTool\n\n## GoogleSearchTool\n\n[[autodoc]] smolagents.default_tools.GoogleSearchTool\n\n## PythonInterpreterTool\n\n[[autodoc]] smolagents.default_tools.PythonInterpreterTool\n\n## SpeechToTextTool\n\n[[autodoc]] smolagents.default_tools.SpeechToTextTool\n\n## UserInputTool\n\n[[autodoc]] 
smolagents.default_tools.UserInputTool\n\n## VisitWebpageTool\n\n[[autodoc]] smolagents.default_tools.VisitWebpageTool\n\n## WebSearchTool\n\n[[autodoc]] smolagents.default_tools.WebSearchTool\n\n## WikipediaSearchTool\n\n[[autodoc]] smolagents.default_tools.WikipediaSearchTool\n"
  },
  {
    "path": "docs/source/en/reference/models.md",
    "content": "# Models\n\n<Tip warning={true}>\n\nSmolagents is an experimental API which is subject to change at any time. Results returned by the agents\ncan vary as the APIs or underlying models are prone to change.\n\n</Tip>\n\nTo learn more about agents and tools make sure to read the [introductory guide](../index). This page\ncontains the API docs for the underlying classes.\n\n## Models\n\nAll model classes in smolagents support passing additional keyword arguments (like `temperature`, `max_tokens`, `top_p`, etc.) directly at instantiation time.\nThese parameters are automatically forwarded to the underlying model's completion calls, allowing you to configure model behavior such as creativity, response length, and sampling strategies.\n\n### Base Model\n\nThe `Model` class serves as the foundation for all model implementations, providing the core interface that custom models must implement to work with agents.\n\n[[autodoc]] Model\n\n### API Model\n\nThe `ApiModel` class serves as the foundation for all API-based model implementations, providing common functionality for external API interactions, rate limiting, and client management that API-specific models inherit.\n\n[[autodoc]] ApiModel\n\n### TransformersModel\n\nFor convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the model_id given at initialization.\n\n```python\nfrom smolagents import TransformersModel\n\nmodel = TransformersModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Ok!\"}]}], stop_sequences=[\"great\"]))\n```\n```text\n>>> What a\n```\n\nYou can pass any keyword arguments supported by the underlying model (such as `temperature`, `max_new_tokens`, `top_p`, etc.) directly at instantiation time. 
These are forwarded to the model completion call:\n\n```python\nmodel = TransformersModel(\n    model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\",\n    temperature=0.7,\n    max_new_tokens=1000\n)\n```\n\n> [!TIP]\n> You must have `transformers` and `torch` installed on your machine. Please run `pip install 'smolagents[transformers]'` if it's not the case.\n\n[[autodoc]] TransformersModel\n\n### InferenceClientModel\n\nThe `InferenceClientModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.\n\nYou can also set a rate limit in requests per minute by using the `requests_per_minute` argument:\n\n```python\nfrom smolagents import InferenceClientModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello, how are you?\"}]}\n]\n\nmodel = InferenceClientModel(provider=\"novita\", requests_per_minute=60)\nprint(model(messages))\n```\n```text\n>>> Of course! If you change your mind, feel free to reach out. Take care!\n```\n\nYou can pass any keyword arguments supported by the underlying model (such as `temperature`, `max_tokens`, `top_p`, etc.) directly at instantiation time. These are forwarded to the model completion call:\n\n```python\nmodel = InferenceClientModel(\n    provider=\"novita\",\n    requests_per_minute=60,\n    temperature=0.8,\n    max_tokens=500\n)\n```\n\n[[autodoc]] InferenceClientModel\n\n### LiteLLMModel\n\nThe `LiteLLMModel` leverages [LiteLLM](https://www.litellm.ai/) to support 100+ LLMs from various providers.\nYou can pass kwargs upon model initialization that will then be used whenever using the model, for instance below we pass `temperature`. 
You can also set a rate limit in requests per minute by using the `requests_per_minute` argument.\n\n```python\nfrom smolagents import LiteLLMModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello, how are you?\"}]}\n]\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", temperature=0.2, max_tokens=10, requests_per_minute=60)\nprint(model(messages))\n```\n\n[[autodoc]] LiteLLMModel\n\n### LiteLLMRouterModel\n\nThe `LiteLLMRouterModel` is a wrapper around the [LiteLLM Router](https://docs.litellm.ai/docs/routing) that leverages\nadvanced routing strategies: load-balancing across multiple deployments, prioritizing critical requests via queueing,\nand implementing basic reliability measures such as cooldowns, fallbacks, and exponential backoff retries.\n\n```python\nimport os\n\nfrom smolagents import LiteLLMRouterModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello, how are you?\"}]}\n]\n\nmodel = LiteLLMRouterModel(\n    model_id=\"llama-3.3-70b\",\n    model_list=[\n        {\n            \"model_name\": \"llama-3.3-70b\",\n            \"litellm_params\": {\"model\": \"groq/llama-3.3-70b\", \"api_key\": os.getenv(\"GROQ_API_KEY\")},\n        },\n        {\n            \"model_name\": \"llama-3.3-70b\",\n            \"litellm_params\": {\"model\": \"cerebras/llama-3.3-70b\", \"api_key\": os.getenv(\"CEREBRAS_API_KEY\")},\n        },\n    ],\n    client_kwargs={\n        \"routing_strategy\": \"simple-shuffle\",\n    },\n)\nprint(model(messages))\n```\n\n[[autodoc]] LiteLLMRouterModel\n\n### OpenAIModel\n\nThis class lets you call any model served through an OpenAI-compatible API.\nHere's how you can set it up (you can customize the `api_base` URL to point to another server):\n```py\nimport os\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"gpt-4o\",\n    api_base=\"https://api.openai.com/v1\",\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n)\n```\n\nYou can pass 
any keyword arguments supported by the underlying model (such as `temperature`, `max_tokens`, `top_p`, etc.) directly at instantiation time. These are forwarded to the model completion call:\n\n```py\nmodel = OpenAIModel(\n    model_id=\"gpt-4o\",\n    api_base=\"https://api.openai.com/v1\",\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n    temperature=0.7,\n    max_tokens=1000,\n    top_p=0.9,\n)\n```\n\n[[autodoc]] OpenAIModel\n\n### AzureOpenAIModel\n\n`AzureOpenAIModel` allows you to connect to any Azure OpenAI deployment. \n\nBelow you can find an example of how to set it up, note that you can omit the `azure_endpoint`, `api_key`, and `api_version` arguments, provided you've set the corresponding environment variables -- `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.\n\nPay attention to the lack of an `AZURE_` prefix for `OPENAI_API_VERSION`, this is due to the way the underlying [openai](https://github.com/openai/openai-python) package is designed. \n\n```py\nimport os\n\nfrom smolagents import AzureOpenAIModel\n\nmodel = AzureOpenAIModel(\n    model_id = os.environ.get(\"AZURE_OPENAI_MODEL\"),\n    azure_endpoint=os.environ.get(\"AZURE_OPENAI_ENDPOINT\"),\n    api_key=os.environ.get(\"AZURE_OPENAI_API_KEY\"),\n    api_version=os.environ.get(\"OPENAI_API_VERSION\")    \n)\n```\n\n[[autodoc]] AzureOpenAIModel\n\n### AmazonBedrockModel\n\n`AmazonBedrockModel` helps you connect to Amazon Bedrock and run your agent with any available models.\n\nBelow is an example setup. 
This class also offers additional options for customization.\n\n```py\nimport os\n\nfrom smolagents import AmazonBedrockModel\n\nmodel = AmazonBedrockModel(\n    model_id=os.environ.get(\"AMAZON_BEDROCK_MODEL_ID\"),\n)\n```\n\n[[autodoc]] AmazonBedrockModel\n\n### MLXModel\n\n\n```python\nfrom smolagents import MLXModel\n\nmodel = MLXModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": \"Ok!\"}], stop_sequences=[\"great\"]))\n```\n```text\n>>> What a\n```\n\n> [!TIP]\n> You must have `mlx-lm` installed on your machine. Please run `pip install 'smolagents[mlx-lm]'` if it's not the case.\n\n[[autodoc]] MLXModel\n\n### VLLMModel\n\nModel to use [vLLM](https://docs.vllm.ai/) for fast LLM inference and serving.\n\n```python\nfrom smolagents import VLLMModel\n\nmodel = VLLMModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": \"Ok!\"}], stop_sequences=[\"great\"]))\n```\n\n> [!TIP]\n> You must have `vllm` installed on your machine. Please run `pip install 'smolagents[vllm]'` if it's not the case.\n\n[[autodoc]] VLLMModel\n\n### Custom Model\n\nYou're free to create and use your own models to power your agent.\n\nYou can subclass the base `Model` class to create a model for your agent.\nThe main requirement is to override the `generate` method so that it meets these two criteria:\n1. It follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns an object with a `.content` attribute.\n2. It stops generating outputs at the sequences passed in the argument `stop_sequences`.\n\nTo define your LLM, you can make a `CustomModel` class that inherits from the base `Model` class.\nIt should have a `generate` method that takes a list of [messages](./chat_templating) and returns an object with a `.content` attribute containing the text. 
The `generate` method also needs to accept a `stop_sequences` argument that indicates when to stop generating.\n\n```python\nfrom huggingface_hub import login, InferenceClient\n\nfrom smolagents import Model\n\nlogin(\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nclient = InferenceClient(model=model_id)\n\nclass CustomModel(Model):\n    # `generate` is an instance method, so it must take `self` first\n    def generate(self, messages, stop_sequences=None, **kwargs):\n        # Fall back to the default stop sequence when none is provided\n        stop_sequences = stop_sequences or [\"Task\"]\n        response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1024)\n        answer = response.choices[0].message\n        return answer\n\ncustom_model = CustomModel()\n```\n\n`generate` can also take a `grammar` argument to allow [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) in order to force properly-formatted agent outputs.\n"
  },
  {
    "path": "docs/source/en/reference/python_executors.md",
    "content": "# Python code executors\n\nPython executors are responsible for running the code generated by code agents in a controlled environment.\nSince agents dynamically generate and execute Python code to accomplish tasks, choosing the right executor is critical\nfor both functionality and security.\n\nTo learn more about code execution and its risks, make sure to read the [Secure code execution](../tutorials/secure_code_execution)\ntutorial. This reference contains the API docs for the underlying classes: the base `PythonExecutor` interface and all \navailable executor implementations.\n\n## Python executor\n\n[[autodoc]] smolagents.local_python_executor.PythonExecutor\n\n## Local Python executor\n\n[[autodoc]] smolagents.local_python_executor.LocalPythonExecutor\n\n## Remote Python executors\n\n[[autodoc]] smolagents.remote_executors.RemotePythonExecutor\n\n### BlaxelExecutor\n\n[[autodoc]] smolagents.remote_executors.BlaxelExecutor\n\n### E2BExecutor\n\n[[autodoc]] smolagents.remote_executors.E2BExecutor\n\n### ModalExecutor\n\n[[autodoc]] smolagents.remote_executors.ModalExecutor\n\n### DockerExecutor\n\n[[autodoc]] smolagents.remote_executors.DockerExecutor\n\n### WasmExecutor\n\n[[autodoc]] smolagents.remote_executors.WasmExecutor\n"
  },
  {
    "path": "docs/source/en/reference/tools.md",
    "content": "# Tools\n\n<Tip warning={true}>\n\nSmolagents is an experimental API which is subject to change at any time. Results returned by the agents\ncan vary as the APIs or underlying models are prone to change.\n\n</Tip>\n\nTo learn more about agents and tools make sure to read the [introductory guide](../index). This page\ncontains the API docs for the underlying classes.\n\n## Tool Base Classes\n\n### load_tool\n\n[[autodoc]] load_tool\n\n### tool\n\n[[autodoc]] tool\n\n### Tool\n\n[[autodoc]] Tool\n\n### launch_gradio_demo\n\n[[autodoc]] launch_gradio_demo\n\n## ToolCollection\n\n[[autodoc]] ToolCollection\n\n## MCP Client\n\n[[autodoc]] smolagents.mcp_client.MCPClient\n\n## Agent Types\n\nAgents can handle any type of object in-between tools; tools, being completely multimodal, can accept and return\ntext, image, audio, video, among other types. In order to increase compatibility between tools, as well as to\ncorrectly render these returns in ipython (jupyter, colab, ipython notebooks, ...), we implement wrapper classes\naround these types.\n\nThe wrapped objects should continue behaving as initially; a text object should still behave as a string, an image\nobject should still behave as a `PIL.Image`.\n\nThese types have three specific purposes:\n\n- Calling `to_raw` on the type should return the underlying object\n- Calling `to_string` on the type should return the object as a string: that can be the string in case of an `AgentText`\n  but will be the path of the serialized version of the object in other instances\n- Displaying it in an ipython kernel should display the object correctly\n\n### AgentText\n\n[[autodoc]] smolagents.agent_types.AgentText\n\n### AgentImage\n\n[[autodoc]] smolagents.agent_types.AgentImage\n\n### AgentAudio\n\n[[autodoc]] smolagents.agent_types.AgentAudio\n"
  },
  {
    "path": "docs/source/en/tutorials/building_good_agents.md",
    "content": "# Building good agents\n\n[[open-in-colab]]\n\nThere's a world of difference between building an agent that works and one that doesn't.\nHow can we build agents that fall into the former category?\nIn this guide, we're going to talk about best practices for building agents.\n\n> [!TIP]\n> If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).\n\n### The best agentic systems are the simplest: simplify the workflow as much as you can\n\nGiving an LLM some agency in your workflow introduces some risk of errors.\n\nWell-programmed agentic systems have good error logging and retry mechanisms anyway, so the LLM engine has a chance to self-correct their mistake. But to reduce the risk of LLM error to the maximum, you should simplify your workflow!\n\nLet's revisit the example from the [intro to agents](../conceptual_guides/intro_agents): a bot that answers user queries for a surf trip company.\nInstead of letting the agent do 2 different calls for \"travel distance API\" and \"weather API\" each time they are asked about a new surf spot, you could just make one unified tool \"return_spot_information\", a function that calls both APIs at once and returns their concatenated outputs to the user.\n\nThis will reduce costs, latency, and error risk!\n\nThe main guideline is: Reduce the number of LLM calls as much as you can.\n\nThis leads to a few takeaways:\n- Whenever possible, group 2 tools in one, like in our example of the two APIs.\n- Whenever possible, logic should be based on deterministic functions rather than agentic decisions.\n\n### Improve the information flow to the LLM engine\n\nRemember that your LLM engine is like an *intelligent* robot, trapped into a room with the only communication with the outside world being notes passed under a door.\n\nIt won't know of anything that happened if you don't explicitly put that into its 
prompt.\n\nSo start by making your task very clear!\nSince an agent is powered by an LLM, minor variations in your task formulation might yield completely different results.\n\nThen, improve the information flow towards your agent in tool use.\n\nParticular guidelines to follow:\n- Each tool should log (by simply using `print` statements inside the tool's `forward` method) everything that could be useful for the LLM engine.\n  - In particular, logging details on tool execution errors would help a lot!\n\nFor instance, here's a tool that retrieves weather data based on location and date-time:\n\nFirst, here's a poor version:\n```python\nimport datetime\nfrom smolagents import tool\n\ndef get_weather_report_at_coordinates(coordinates, date_time):\n    # Dummy function, returns a list of [temperature in °C, risk of rain on a scale 0-1, wave height in m]\n    return [28.0, 0.35, 0.85]\n\ndef convert_location_to_coordinates(location):\n    # Returns dummy coordinates\n    return [3.3, -42.0]\n\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for.\n        date_time: the date and time for which you want the report.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    date_time = datetime.datetime.strptime(date_time, '%m/%d/%y %H:%M:%S')\n    return str(get_weather_report_at_coordinates((lon, lat), date_time))\n```\n\nWhy is it bad?\n- the format expected for `date_time` is not specified.\n- there's no detail on how `location` should be specified.\n- there's no logging mechanism that makes failure cases explicit, such as `location` not being in a proper format or `date_time` not being properly formatted.\n- the output format is hard to understand.\n\nIf the tool call fails, the error trace logged in memory can help the LLM reverse engineer the tool to fix the errors. 
But why leave it with so much heavy lifting to do?\n\nA better way to build this tool would have been the following:\n```python\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for. Should be a place name, followed by possibly a city name, then a country, like \"Anchor Point, Taghazout, Morocco\".\n        date_time: the date and time for which you want the report, formatted as '%m/%d/%y %H:%M:%S'.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    try:\n        date_time = datetime.datetime.strptime(date_time, '%m/%d/%y %H:%M:%S')\n    except Exception as e:\n        raise ValueError(\"Conversion of `date_time` to datetime format failed, make sure to provide a string in format '%m/%d/%y %H:%M:%S'. Full trace: \" + str(e))\n    temperature_celsius, risk_of_rain, wave_height = get_weather_report_at_coordinates((lon, lat), date_time)\n    return f\"Weather report for {location}, {date_time}: Temperature will be {temperature_celsius}°C, risk of rain is {risk_of_rain*100:.0f}%, wave height is {wave_height}m.\"\n```\n\nIn general, to ease the load on your LLM, a good question to ask yourself is: \"How easy would it be for me, if I was dumb and using this tool for the first time ever, to program with this tool and correct my own errors?\".\n\n### Give more arguments to the agent\n\nTo pass additional objects to your agent beyond the simple string describing the task, you can use the `additional_args` argument, which accepts any type of object:\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(model_id=model_id), add_base_tools=True)\n\nagent.run(\n    \"Why does Mike not know many people in New York?\",\n    
additional_args={\"mp3_sound_file_url\":'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3'}\n)\n```\nFor instance, you can use this `additional_args` argument to pass images or strings that you want your agent to leverage.\n\n\n\n## How to debug your agent\n\n### 1. Use a stronger LLM\n\nIn an agentic workflows, some of the errors are actual errors, some other are the fault of your LLM engine not reasoning properly.\nFor instance, consider this trace for an `CodeAgent` that I asked to create a car picture:\n```\n==================================================================================================== New task ====================================================================================================\nMake me a cool car picture\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nimage_generator(prompt=\"A cool, futuristic sports car with LED headlights, aerodynamic design, and vibrant color, high-res, photorealistic\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nStep 1:\n\n- Time taken: 16.35 seconds\n- Input tokens: 1,383\n- Output tokens: 
77\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nfinal_answer(\"/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nPrint outputs:\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nFinal answer:\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\n```\nThe user sees, instead of an image being returned, a path being returned to them.\nIt could look like a bug from the system, but actually the agentic system didn't cause the error: it's just that the LLM brain did the mistake of not saving the image output into a variable.\nThus it cannot access the image again except by leveraging the path that was logged while saving the image, so it returns the path instead of an image.\n\nThe first step to debugging your agent is thus \"Use a more powerful LLM\". Alternatives like `Qwen2/5-72B-Instruct` wouldn't have made that mistake.\n\n### 2. 
Provide more information or specific instructions\n\nYou can also use less powerful models, provided you guide them more effectively.\n\nPut yourself in the shoes of your model: if you were the model solving the task, would you struggle with the information available to you (from the system prompt + task formulation + tool description)?\n\nWould you need detailed instructions?\n\n- If the instruction should always be given to the agent (as we generally understand a system prompt to work): you can pass it as a string via the `instructions` argument upon agent initialization. *(Note: instructions are appended to the system prompt; they do not replace it.)*\n- If it's about a specific task to solve: add all these details to the task. The task could be very long, like dozens of pages.\n- If it's about how to use specific tools: include it in the `description` attribute of these tools.\n\n\n### 3. Change the prompt templates (generally not advised)\n\nIf the above clarifications are not sufficient, you can change the agent's prompt templates.\n\nLet's see how it works. For example, let us check the default prompt templates for the [`CodeAgent`] (the version below is shortened by skipping zero-shot examples).\n\n```python\nprint(agent.prompt_templates[\"system_prompt\"])\n```\nHere is what you get:\n```text\nYou are an expert assistant who can solve any task using code blobs. You will be given a task to solve as best you can.\nTo do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\nTo solve the task, you must plan forward to proceed in a series of steps, in a cycle of Thought, Code, and Observation sequences.\n\nAt each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.\nThen in the Code sequence you should write the code in simple Python. 
The code sequence must be opened with '{{code_block_opening_tag}}', and closed with '{{code_block_closing_tag}}'.\nDuring each intermediate step, you can use 'print()' to save whatever important information you will then need.\nThese print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\nIn the end you have to return a final answer using the `final_answer` tool.\n\nHere are a few examples using notional tools:\n---\nTask: \"Generate an image of the oldest person in this document.\"\n\nThought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\n{{code_block_opening_tag}}\nanswer = document_qa(document=document, question=\"Who is the oldest person mentioned?\")\nprint(answer)\n{{code_block_closing_tag}}\nObservation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\nThought: I will now generate an image showcasing the oldest person.\n{{code_block_opening_tag}}\nimage = image_generator(\"A portrait of John Doe, a 55-year-old man living in Canada.\")\nfinal_answer(image)\n{{code_block_closing_tag}}\n\n---\nTask: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\nThought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool\n{{code_block_opening_tag}}\nresult = 5 + 3 + 1294.678\nfinal_answer(result)\n{{code_block_closing_tag}}\n\n---\nTask:\n\"Answer the question in the variable `question` about the image stored in the variable `image`. 
The question is in French.\nYou have been provided with these additional arguments, that you can access using the keys as variables in your python code:\n{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}\"\n\nThought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.\n{{code_block_opening_tag}}\ntranslated_question = translator(question=question, src_lang=\"French\", tgt_lang=\"English\")\nprint(f\"The translated question is {translated_question}.\")\nanswer = image_qa(image=image, question=translated_question)\nfinal_answer(f\"The answer is {answer}\")\n{{code_block_closing_tag}}\n\n---\nTask:\nIn a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\nWhat does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\nThought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\n{{code_block_opening_tag}}\npages = web_search(query=\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\")\nprint(pages)\n{{code_block_closing_tag}}\nObservation:\nNo result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\nThought: The query was maybe too restrictive and did not find any results. 
Let's try again with a broader query.\n{{code_block_opening_tag}}\npages = web_search(query=\"1979 interview Stanislaus Ulam\")\nprint(pages)\n{{code_block_closing_tag}}\nObservation:\nFound 6 pages:\n[Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n[Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n(truncated)\n\nThought: I will read the first 2 pages to know more.\n{{code_block_opening_tag}}\nfor url in [\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\", \"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\"]:\n    whole_page = visit_webpage(url)\n    print(whole_page)\n    print(\"\\n\" + \"=\"*80 + \"\\n\")  # Print separator between pages\n{{code_block_closing_tag}}\nObservation:\nManhattan Project Locations:\nLos Alamos, NM\nStanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. 
In this interview, he discusses his work at\n(truncated)\n\nThought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\" Let's answer in one word.\n{{code_block_opening_tag}}\nfinal_answer(\"diminished\")\n{{code_block_closing_tag}}\n\n---\nTask: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\nThought: I need to get the populations for both cities and compare them: I will use the tool `web_search` to get the population of both cities.\n{{code_block_opening_tag}}\nfor city in [\"Guangzhou\", \"Shanghai\"]:\n    print(f\"Population {city}:\", web_search(f\"{city} population\")\n{{code_block_closing_tag}}\nObservation:\nPopulation Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\nPopulation Shanghai: '26 million (2019)'\n\nThought: Now I know that Shanghai has the highest population.\n{{code_block_opening_tag}}\nfinal_answer(\"Shanghai\")\n{{code_block_closing_tag}}\n\n---\nTask: \"What is the current age of the pope, raised to the power 0.36?\"\n\nThought: I will use the tool `wikipedia_search` to get the age of the pope, and confirm that with a web search.\n{{code_block_opening_tag}}\npope_age_wiki = wikipedia_search(query=\"current pope age\")\nprint(\"Pope age as per wikipedia:\", pope_age_wiki)\npope_age_search = web_search(query=\"current pope age\")\nprint(\"Pope age as per google search:\", pope_age_search)\n{{code_block_closing_tag}}\nObservation:\nPope age: \"The pope Francis is currently 88 years old.\"\n\nThought: I know that the pope is 88 years old. Let's compute the result using python code.\n{{code_block_opening_tag}}\npope_current_age = 88 ** 0.36\nfinal_answer(pope_current_age)\n{{code_block_closing_tag}}\n\nAbove example were using notional tools that might not exist for you. 
On top of performing computations in the Python code snippets that you create, you only have access to these tools, behaving like regular python functions:\n{{code_block_opening_tag}}\n{%- for tool in tools.values() %}\n{{ tool.to_code_prompt() }}\n{% endfor %}\n{{code_block_closing_tag}}\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n{{code_block_opening_tag}}\n{%- for agent in managed_agents.values() %}\ndef {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n    \"\"\"{{ agent.description }}\n\n    Args:\n        task: Long detailed description of the task.\n        additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n    \"\"\"\n{% endfor %}\n{{code_block_closing_tag}}\n{%- endif %}\n\nHere are the rules you should always follow to solve your task:\n1. Always provide a 'Thought:' sequence, and a '{{code_block_opening_tag}}' sequence ending with '{{code_block_closing_tag}}', else you will fail.\n2. Use only variables that you have defined!\n3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wikipedia_search(query=\"What is the place where James Bond lives?\")'.\n4. For tools WITHOUT JSON output schema: Take care to not chain too many sequential tool calls in the same code block, as their output format is unpredictable. 
For instance, a call to wikipedia_search without a JSON output schema has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n5. For tools WITH JSON output schema: You can confidently chain multiple tool calls and directly access structured output fields in the same code block! When a tool has a JSON output schema, you know exactly what fields and data types to expect, allowing you to write robust code that directly accesses the structured response (e.g., result['field_name']) without needing intermediate print() statements.\n6. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n7. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n8. Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n9. You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n10. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n11. Don't give up! 
You're in charge of solving the task, not providing directions to solve it.\n\n{%- if custom_instructions %}\n{{custom_instructions}}\n{%- endif %}\n\nNow Begin!\n```\n\nAs you can see, there are placeholders like `\"{{ tool.description }}\"`: these will be used upon agent initialization to insert certain automatically generated descriptions of tools or managed agents.\n\nSo while you can overwrite this system prompt template by passing your custom prompt as an argument to the `system_prompt` parameter, your new system prompt can contain the following placeholders:\n- To insert tool descriptions:\n  ```\n  {%- for tool in tools.values() %}\n  - {{ tool.to_tool_calling_prompt() }}\n  {%- endfor %}\n  ```\n- To insert the descriptions for managed agents if there are any:\n  ```\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. 
Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  {%- for agent in managed_agents.values() %}\n  - {{ agent.name }}: {{ agent.description }}\n  {%- endfor %}\n  {%- endif %}\n  ```\n- For `CodeAgent` only, to insert the list of authorized imports: `\"{{authorized_imports}}\"`\n\nThen you can change the system prompt as follows:\n\n```py\nagent.prompt_templates[\"system_prompt\"] = agent.prompt_templates[\"system_prompt\"] + \"\\nHere you go!\"\n```\n\nThis also works with the [`ToolCallingAgent`].\n\nBut generally it's simpler to pass the `instructions` argument upon agent initialization, like:\n```py\nagent = CodeAgent(tools=[], model=InferenceClientModel(model_id=model_id), instructions=\"Always talk like a 5 year old.\")\n```\n\nNote that `instructions` are appended to the system prompt rather than replacing it.\n\n\n### 4. Extra planning\n\nWe provide a model for a supplementary planning step that an agent can run regularly in between normal action steps. In this step, there is no tool call: the LLM is simply asked to update a list of facts it knows and to reflect on what steps it should take next based on those facts.\n\n```py\nfrom smolagents import load_tool, CodeAgent, InferenceClientModel, WebSearchTool\nfrom dotenv import load_dotenv\n\nload_dotenv()\n\n# Import tool from Hub\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nsearch_tool = WebSearchTool()\n\nagent = CodeAgent(\n    tools=[search_tool, image_generation_tool],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen2.5-72B-Instruct\"),\n    planning_interval=3 # This is where you activate planning!\n)\n\n# Run it!\nresult = agent.run(\n    \"How long would a cheetah at full speed take to run the length of Pont Alexandre III?\",\n)\n```\n"
  },
  {
    "path": "docs/source/en/tutorials/inspect_runs.md",
    "content": "# Inspecting runs with OpenTelemetry\n\n[[open-in-colab]]\n\n> [!TIP]\n> If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).\n\n## Why log your agent runs?\n\nAgent runs are complicated to debug.\n\nValidating that a run went properly is hard, since agent workflows are [unpredictable by design](../conceptual_guides/intro_agents) (if they were predictable, you'd just be using good old code). \n\nAnd inspecting a run is hard as well: multi-step agents tend to quickly fill a console with logs, and most of the errors are just \"LLM dumb\" kind of errors, from which the LLM auto-corrects in the next step by writing better code or tool calls.\n\nSo using instrumentation to record agent runs is necessary in production for later inspection and monitoring!\n\nWe've adopted the [OpenTelemetry](https://opentelemetry.io/) standard for instrumenting agent runs.\n\nThis means that you can just run some instrumentation code, then run your agents normally, and everything gets logged into your platform. Below are some examples of how to do this with different OpenTelemetry backends.\n\nHere's how it then looks like on the platform:\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.gif\"/>\n</div>\n\n\n## Setting up telemetry with Arize AI Phoenix\nFirst install the required packages. 
Here we install [Phoenix by Arize AI](https://github.com/Arize-ai/phoenix) because that's a good solution to collect and inspect the logs, but there are other OpenTelemetry-compatible platforms that you could use for this collection & inspection part.\n\n```shell\npip install 'smolagents[telemetry,toolkit]'\n```\n\nThen run the collector in the background.\n\n```shell\npython -m phoenix.server.main serve\n```\n\nFinally, set up `SmolagentsInstrumentor` to trace your agents and send the traces to Phoenix default endpoint.\n\n```python\nfrom phoenix.otel import register\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\n\nregister()\nSmolagentsInstrumentor().instrument()\n```\nThen you can run your agents!\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel()\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n)\nmanager_agent.run(\n    \"If the US keeps its 2024 growth rate, how many years will it take for the GDP to double?\"\n)\n```\nVoilà!\nYou can then navigate to `http://0.0.0.0:6006/projects/` to inspect your run!\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.png\">\n\nYou can see that the CodeAgent called its managed ToolCallingAgent (by the way, the managed agent could have been a CodeAgent as well) to ask it to run the web search for the U.S. 2024 growth rate. Then the managed agent returned its report and the manager agent acted upon it to calculate the economy doubling time! 
Sweet, isn't it?\n\n## Setting up telemetry with MLflow\n\nMLflow has one-line autologging for Smolagents: it tracks runs, spans, inputs/outputs, and token usage in the MLflow UI.\n\nInstall MLflow, enable autologging, then run your agent with a couple of tools:\n\n```python\n%pip install mlflow smolagents\n\nimport mlflow\nfrom smolagents import CodeAgent, ToolCallingAgent, WebSearchTool, VisitWebpageTool, InferenceClientModel\n\nmlflow.smolagents.autolog()  # start tracing everything below\n\nmodel = InferenceClientModel()\nbrowser = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"Web search helper\",\n)\nmanager = CodeAgent(model=model, managed_agents=[browser])\nmanager.run(\"Find the latest US GDP growth rate and estimate when it would double.\")\n```\n\nStart the UI to inspect traces, then open the Traces view in your browser:\n\n```shell\nmlflow server --port 5000\n```\n\n## Setting up telemetry with 🪢 Langfuse\n\nThis part shows how to monitor and debug your Hugging Face **smolagents** with **Langfuse** using the `SmolagentsInstrumentor`.\n\n> **What is Langfuse?** [Langfuse](https://langfuse.com) is an open-source platform for LLM engineering. It provides tracing and monitoring capabilities for AI agents, helping developers debug, analyze, and optimize their products. Langfuse integrates with various tools and frameworks via native integrations, OpenTelemetry, and SDKs.\n\n### Step 1: Install Dependencies\n\n```python\n%pip install langfuse 'smolagents[telemetry]' openinference-instrumentation-smolagents\n```\n\n### Step 2: Set Up Environment Variables\n\nSet your Langfuse API keys and configure the OpenTelemetry endpoint to send traces to Langfuse. 
Get your Langfuse API keys by signing up for [Langfuse Cloud](https://cloud.langfuse.com) or [self-hosting Langfuse](https://langfuse.com/self-hosting).\n\nAlso, add your [Hugging Face token](https://huggingface.co/settings/tokens) (`HF_TOKEN`) as an environment variable.\n\n```python\nimport os\n# Get keys for your project from the project settings page: https://cloud.langfuse.com\nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-lf-...\" \nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-lf-...\" \nos.environ[\"LANGFUSE_HOST\"] = \"https://cloud.langfuse.com\" # 🇪🇺 EU region\n# os.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\" # 🇺🇸 US region\n \n# your Hugging Face token\nos.environ[\"HF_TOKEN\"] = \"hf_...\"\n```\n\nWith the environment variables set, we can now initialize the Langfuse client. `get_client()` initializes the Langfuse client using the credentials provided in the environment variables.\n\n```python\nfrom langfuse import get_client\n \nlangfuse = get_client()\n \n# Verify connection\nif langfuse.auth_check():\n    print(\"Langfuse client is authenticated and ready!\")\nelse:\n    print(\"Authentication failed. Please check your credentials and host.\")\n```\n\n### Step 3: Initialize the `SmolagentsInstrumentor`\n\nInitialize the `SmolagentsInstrumentor` before your application code. 
\n\n\n```python\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\n \nSmolagentsInstrumentor().instrument()\n```\n\n### Step 4: Run your smolagent\n\n```python\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel(\n    model_id=\"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B\"\n)\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n)\nmanager_agent.run(\n    \"How can Langfuse be used to monitor and improve the reasoning and decision-making of smolagents when they execute multi-step tasks, like dynamically adjusting a recipe based on user feedback or available ingredients?\"\n)\n```\n\n### Step 5: View Traces in Langfuse\n\nAfter running the agent, you can view the traces generated by your smolagents application in [Langfuse](https://cloud.langfuse.com). You should see detailed steps of the LLM interactions, which can help you debug and optimize your AI agent.\n\n![smolagents example trace](https://langfuse.com/images/cookbook/integration-smolagents/smolagent_example_trace.png)\n\n_[Public example trace in Langfuse](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/ce5160f9bfd5a6cd63b07d2bfcec6f54?timestamp=2025-02-11T09%3A25%3A45.163Z&display=details)_\n"
  },
  {
    "path": "docs/source/en/tutorials/memory.md",
    "content": "# 📚 Manage your agent's memory\n\n[[open-in-colab]]\n\nIn the end, an agent can be defined by simple components: it has tools, prompts.\nAnd most importantly, it has a memory of past steps, drawing a history of planning, execution, and errors.\n\n### Replay your agent's memory\n\nWe propose several features to inspect a past agent run.\n\nYou can instrument the agent's run to display it in a great UI that lets you zoom in/out on specific steps, as highlighted in the [instrumentation guide](./inspect_runs).\n\nYou can also use `agent.replay()`, as follows:\n\nAfter the agent has run:\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=0)\n\nresult = agent.run(\"What's the 20th Fibonacci number?\")\n```\n\nIf you want to replay this last run, just use:\n```py\nagent.replay()\n```\n\n### Dynamically change the agent's memory\n\nMany advanced use cases require dynamic modification of the agent's memory.\n\nYou can access the agent's memory using:\n\n```py\nfrom smolagents import ActionStep\n\nsystem_prompt_step = agent.memory.system_prompt\nprint(\"The system prompt given to the agent was:\")\nprint(system_prompt_step.system_prompt)\n\ntask_step = agent.memory.steps[0]\nprint(\"\\n\\nThe first task step was:\")\nprint(task_step.task)\n\nfor step in agent.memory.steps:\n    if isinstance(step, ActionStep):\n        if step.error is not None:\n            print(f\"\\nStep {step.step_number} got this error:\\n{step.error}\\n\")\n        else:\n            print(f\"\\nStep {step.step_number} got these observations:\\n{step.observations}\\n\")\n```\n\nUse `agent.memory.get_full_steps()` to get full steps as dictionaries.\n\nYou can also use step callbacks to dynamically change the agent's memory.\n\nStep callbacks can access the `agent` itself in their arguments, so they can access any memory step as highlighted above, and change it if needed. 
For instance, let's say you are observing screenshots of each step performed by a web browser agent. You want to log the newest screenshot, and remove the images from ancient steps to save on token costs.\n\nYou could run something like the following.\n_Note: this code is incomplete, some imports and object definitions have been removed for the sake of concision, visit [the original script](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py) to get the full working code._\n\n```py\nimport helium\nfrom PIL import Image\nfrom io import BytesIO\nfrom time import sleep\n\ndef update_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # Let JavaScript animations happen before taking the screenshot\n    driver = helium.get_driver()\n    latest_step = memory_step.step_number\n    for previous_memory_step in agent.memory.steps:  # Remove previous screenshots from logs for lean processing\n        if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= latest_step - 2:\n            previous_memory_step.observations_images = None\n    png_bytes = driver.get_screenshot_as_png()\n    image = Image.open(BytesIO(png_bytes))\n    memory_step.observations_images = [image.copy()]\n```\n\nThen you should pass this function in the `step_callbacks` argument upon initialization of your agent:\n\n```py\nCodeAgent(\n    tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],\n    model=model,\n    additional_authorized_imports=[\"helium\"],\n    step_callbacks=[update_screenshot],\n    max_steps=20,\n    verbosity_level=2,\n)\n```\n\nHead to our [vision web browser code](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py) to see the full working example.\n\n### Run agents one step at a time\n\nThis can be useful in case you have tool calls that take days: you can just run your agents step by step.\nThis will also let you update the memory on 
each step.\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent, ActionStep, TaskStep\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=1)\nagent.python_executor.send_tools({**agent.tools})\nprint(agent.memory.system_prompt)\n\ntask = \"What is the 20th Fibonacci number?\"\n\n# You could modify the memory as needed here by inputting the memory of another agent.\n# agent.memory.steps = previous_agent.memory.steps\n\n# Let's start a new task!\nagent.memory.steps.append(TaskStep(task=task, task_images=[]))\n\nfinal_answer = None\nstep_number = 1\nwhile final_answer is None and step_number <= 10:\n    memory_step = ActionStep(\n        step_number=step_number,\n        observations_images=[],\n    )\n    # Run one step.\n    final_answer = agent.step(memory_step)\n    agent.memory.steps.append(memory_step)\n    step_number += 1\n\n    # Change the memory as you please!\n    # For instance to update the latest step:\n    # agent.memory.steps[-1] = ...\n\nprint(\"The final answer is:\", final_answer)\n```\n"
  },
  {
    "path": "docs/source/en/tutorials/secure_code_execution.md",
    "content": "# Secure code execution\n\n[[open-in-colab]]\n\n> [!TIP]\n> If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).\n\n### Code agents\n\n[Multiple](https://huggingface.co/papers/2402.01030) [research](https://huggingface.co/papers/2411.01747) [papers](https://huggingface.co/papers/2401.00812) have shown that having the LLM write its actions (the tool calls) in code is much better than the current standard format for tool calling, which is across the industry different shades of \"writing actions as a JSON of tools names and arguments to use\".\n\nWhy is code better? Well, because we crafted our code languages specifically to be great at expressing actions performed by a computer. If JSON snippets were a better way, this package would have been written in JSON snippets and the devil would be laughing at us.\n\nCode is just a better way to express actions on a computer. 
It has better:\n- **Composability:** could you nest JSON actions within each other, or define a set of JSON actions to re-use later, the same way you could just define a python function?\n- **Object management:** how do you store the output of an action like `generate_image` in JSON?\n- **Generality:** code is built to express simply anything you can have a computer do.\n- **Representation in LLM training corpus:** why not take advantage of the fact that plenty of quality code actions are already included in LLM training corpora?\n\nThis is illustrated in the figure below, taken from [Executable Code Actions Elicit Better LLM Agents](https://huggingface.co/papers/2402.01030).\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png\">\n\nThis is why we put emphasis on proposing code agents, in this case python agents, which meant putting higher effort on building secure python interpreters.\n\n### Local code execution??\n\nBy default, the `CodeAgent` runs LLM-generated code in your environment.\n\nThis is inherently risky: LLM-generated code could be harmful to your environment.\n\nMalicious code execution can occur in several ways:\n- **Plain LLM error:** LLMs are still far from perfect and may unintentionally generate harmful commands while attempting to be helpful. While this risk is low, instances have been observed where an LLM attempted to execute potentially dangerous code.  \n- **Supply chain attack:** Running an untrusted or compromised LLM could expose a system to harmful code generation. While this risk is extremely low when using well-known models on secure inference infrastructure, it remains a theoretical possibility.  
\n- **Prompt injection:** An agent browsing the web could land on a malicious website that contains harmful instructions, thus injecting an attack into the agent's memory.\n- **Exploitation of publicly accessible agents:** Agents exposed to the public can be misused by malicious actors to execute harmful code. Attackers may craft adversarial inputs to exploit the agent's execution capabilities, leading to unintended consequences.\n\nOnce malicious code is executed, whether accidentally or intentionally, it can damage the file system, exploit local or cloud-based resources, abuse API services, and even compromise network security.\n\nOne could argue that on the [spectrum of agency](../conceptual_guides/intro_agents), code agents give much higher agency to the LLM on your system than other less agentic setups: this goes hand-in-hand with higher risk.\n\nSo you need to be very mindful of security.\n\nTo improve safety, we propose a range of measures that offer increasing levels of security, at a higher setup cost.\n\nWe advise you to keep in mind that no solution will be 100% safe.\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/code_execution_safety_diagram.png\">\n\n### Our local Python executor\n\nTo add a first layer of security, code execution in `smolagents` is not performed by the vanilla Python interpreter.\nWe have re-built a more secure `LocalPythonExecutor` from the ground up.\n\nTo be precise, this interpreter works by loading the Abstract Syntax Tree (AST) from your code and executing it operation by operation, making sure to always follow certain rules:\n- By default, imports are disallowed unless they have been explicitly added to an authorization list by the user.\n- Furthermore, access to submodules is disabled by default, and each must be explicitly authorized in the import list as well, or you can pass for instance `numpy.*` to allow both `numpy` and all its subpackages, like `numpy.random` or 
`numpy.a.b`.\n   - Note that some seemingly innocuous packages like `random` can give access to potentially harmful submodules, as in `random._os`.\n- The total count of elementary operations processed is capped to prevent infinite loops and resource bloating.\n- Any operation that has not been explicitly defined in our custom interpreter will raise an error.\n\nYou can try these safeguards as follows:\n\n```py\nfrom smolagents.local_python_executor import LocalPythonExecutor\n\n# Set up custom executor, authorize package \"numpy\"\ncustom_executor = LocalPythonExecutor([\"numpy\"])\n\n# Utility for pretty printing errors\ndef run_capture_exception(command: str):\n    try:\n        custom_executor(command)\n    except Exception as e:\n        print(\"ERROR:\\n\", e)\n\n# Undefined commands simply do not work\nharmful_command=\"!echo Bad command\"\nrun_capture_exception(harmful_command)\n# >>> ERROR: invalid syntax (<unknown>, line 1)\n\n\n# Imports like os will not be performed unless explicitly added to `additional_authorized_imports`\nharmful_command=\"import os; exit_code = os.system('echo Bad command')\"\nrun_capture_exception(harmful_command)\n# >>> ERROR: Code execution failed at line 'import os' due to: InterpreterError: Import of os is not allowed. 
Authorized imports are: ['statistics', 'numpy', 'itertools', 'time', 'queue', 'collections', 'math', 'random', 're', 'datetime', 'stat', 'unicodedata']\n\n# Even in authorized imports, potentially harmful packages will not be imported\nharmful_command=\"import random; random._os.system('echo Bad command')\"\nrun_capture_exception(harmful_command)\n# >>> ERROR: Code execution failed at line 'random._os.system('echo Bad command')' due to: InterpreterError: Forbidden access to module: os\n\n# Infinite loops are interrupted after N operations\nharmful_command=\"\"\"\nwhile True:\n    pass\n\"\"\"\nrun_capture_exception(harmful_command)\n# >>> ERROR: Code execution failed at line 'while True: pass' due to: InterpreterError: Maximum number of 1000000 iterations in While loop exceeded\n```\n\nThese safeguards make our interpreter safer.\nWe have used it on a wide range of use cases, without ever observing any damage to the environment.\n\n> [!WARNING]\n> It's important to understand that no local python sandbox can ever be completely secure. While our interpreter provides significant safety improvements over the standard Python interpreter, it is still possible for a determined attacker or a fine-tuned malicious LLM to find vulnerabilities and potentially harm your environment. \n> \n> For example, if you've allowed packages like `Pillow` to process images, the LLM could generate code that creates thousands of large image files to fill your hard drive. Other advanced escape techniques might exploit deeper vulnerabilities in authorized packages.\n> \n> Running LLM-generated code in your local environment always carries some inherent risk. 
The only way to run LLM-generated code with truly robust security isolation is to use remote execution options like E2B or Docker, as detailed below.\n\nThe risk of a malicious attack is low when using well-known LLMs from trusted inference providers, but it is not zero.\nFor high-security applications or when using less trusted models, you should consider using a remote execution sandbox.\n\n## Sandbox approaches for secure code execution\n\nWhen working with AI agents that execute code, security is paramount. There are two main approaches to sandboxing code execution in smolagents, each with different security properties and capabilities:\n\n\n![Sandbox approaches comparison](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/sandboxed_execution.png)\n\n1. **Running individual code snippets in a sandbox**: This approach (left side of diagram) only executes the agent-generated Python code snippets in a sandbox while keeping the rest of the agentic system in your local environment. It's simpler to set up using `executor_type=\"blaxel\"`, `executor_type=\"e2b\"`, `executor_type=\"modal\"`, or\n`executor_type=\"docker\"`, but it doesn't support multi-agents and still requires passing state data between your environment and the sandbox.\n\n2. **Running the entire agentic system in a sandbox**: This approach (right side of diagram) runs the entire agentic system, including the agent, model, and tools, within a sandbox environment. This provides better isolation but requires more manual setup and may require passing sensitive credentials (like API keys) to the sandbox environment.\n\nThis guide describes how to set up and use both types of sandbox approaches for your agent applications.\n\n### Blaxel setup\n\n#### Installation\n\n1. Create a Blaxel account at [blaxel.ai](https://blaxel.ai)\n2. 
Install the required packages:\n```bash\npip install 'smolagents[blaxel]'\n```\n\n#### Running your agent with Blaxel: quick start\n\nWe provide a simple way to use a Blaxel Sandbox: simply add `executor_type=\"blaxel\"` to the agent initialization, as follows:\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nwith CodeAgent(model=InferenceClientModel(), tools=[], executor_type=\"blaxel\") as agent:\n    agent.run(\"Can you give me the 100th Fibonacci number?\")\n```\n\n> [!TIP]\n> Using the agent as a context manager (with the `with` statement) ensures that the Blaxel sandbox is cleaned up immediately after the agent completes its task.\n> Alternatively, you can manually call the agent's `cleanup()` method.\n\nThis solution sends the agent state to the server at the start of each `agent.run()`.\nThen the models are called from the local environment, but the generated code will be sent to the sandbox for execution, and only the output will be returned.\n\nBlaxel provides fast-launching virtual machines that start from hibernation in under 25ms and scale back to zero after inactivity while maintaining memory state, making it an excellent choice for agent applications that require quick, secure code execution.\n\n> [!TIP]\n> For even stronger security isolation, you can host your entire agent remotely on Blaxel. This provides complete sandboxing of the agent, model, and tools. See the [Blaxel agent hosting documentation](https://docs.blaxel.ai/Agents/Develop-an-agent-py) for details.\n\n### E2B setup\n\n#### Installation\n\n1. Create an E2B account at [e2b.dev](https://e2b.dev)\n2. 
Install the required packages:\n```bash\npip install 'smolagents[e2b]'\n```\n\n#### Running your agent in E2B: quick start\n\nWe provide a simple way to use an E2B Sandbox: simply add `executor_type=\"e2b\"` to the agent initialization, as follows:\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nwith CodeAgent(model=InferenceClientModel(), tools=[], executor_type=\"e2b\") as agent:\n    agent.run(\"Can you give me the 100th Fibonacci number?\")\n```\n\n> [!TIP]\n> Using the agent as a context manager (with the `with` statement) ensures that the E2B sandbox is cleaned up immediately after the agent completes its task.\n> Alternatively, you can manually call the agent's `cleanup()` method.\n\nThis solution sends the agent state to the server at the start of each `agent.run()`.\nThen the models are called from the local environment, but the generated code will be sent to the sandbox for execution, and only the output will be returned.\n\nThis is illustrated in the figure below.\n\n<p align=\"center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/sandboxed_execution.png\" alt=\"sandboxed code execution\" width=60% max-width=500px>\n</p>\n\nHowever, any call to a [managed agent](../examples/multiagents) would require model calls, and since we do not transfer secrets to the remote sandbox, those model calls would lack credentials.\nHence this solution does not work (yet) with more complicated multi-agent setups.\n\n#### Running your agent in E2B: multi-agents\n\nTo use multi-agents in an E2B sandbox, you need to run your agents completely from within E2B.\n\nHere is how to do it:\n\n```python\nfrom e2b_code_interpreter import Sandbox\nimport os\n\n# Create the sandbox\nsandbox = Sandbox()\n\n# Install required packages\nsandbox.commands.run(\"pip install smolagents\")\n\ndef run_code_raise_errors(sandbox, code: str, verbose: bool = False) -> str:\n    execution = sandbox.run_code(\n        
code,\n        envs={'HF_TOKEN': os.getenv('HF_TOKEN')}\n    )\n    if execution.error:\n        execution_logs = \"\\n\".join([str(log) for log in execution.logs.stdout])\n        logs = execution_logs\n        logs += execution.error.traceback\n        raise ValueError(logs)\n    return \"\\n\".join([str(log) for log in execution.logs.stdout])\n\n# Define your agent application\nagent_code = \"\"\"\nimport os\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# Initialize the agents\nagent = CodeAgent(\n    model=InferenceClientModel(token=os.getenv(\"HF_TOKEN\"), provider=\"together\"),\n    tools=[],\n    name=\"coder_agent\",\n    description=\"This agent takes care of your difficult algorithmic problems using code.\"\n)\n\nmanager_agent = CodeAgent(\n    model=InferenceClientModel(token=os.getenv(\"HF_TOKEN\"), provider=\"together\"),\n    tools=[],\n    managed_agents=[agent],\n)\n\n# Run the agent\nresponse = manager_agent.run(\"What's the 20th Fibonacci number?\")\nprint(response)\n\"\"\"\n\n# Run the agent code in the sandbox\nexecution_logs = run_code_raise_errors(sandbox, agent_code)\nprint(execution_logs)\n```\n\n### Modal setup\n\n#### Installation\n\n1. Create a Modal account at [modal.com](https://modal.com/signup)\n2. 
Install the required packages:\n```bash\npip install 'smolagents[modal]'\n```\n\n#### Running your agent in Modal: quick start\n\nWe provide a simple way to use a Modal Sandbox: simply add `executor_type=\"modal\"` to the agent initialization, as follows:\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nwith CodeAgent(model=InferenceClientModel(), tools=[], executor_type=\"modal\") as agent:\n    agent.run(\"What is the 42nd Fibonacci number?\")\n```\n\n> [!TIP]\n> Using the agent as a context manager (with the `with` statement) ensures that the Modal sandbox is cleaned up immediately after the agent completes its task.\n> Alternatively, you can manually call the agent's `cleanup()` method.\n\nThe agent state and the code generated by the `InferenceClientModel` are sent to a Modal sandbox, which executes the code securely.\n\n### Docker setup\n\n#### Installation\n\n1. [Install Docker on your system](https://docs.docker.com/get-started/get-docker/)\n2. Install the required packages:\n```bash\npip install 'smolagents[docker]'\n```\n\n#### Running your agent in Docker: quick start\n\nSimilar to the E2B Sandbox above, to quickly get started with Docker, simply add `executor_type=\"docker\"` to the agent initialization, like:\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nwith CodeAgent(model=InferenceClientModel(), tools=[], executor_type=\"docker\") as agent:\n    agent.run(\"Can you give me the 100th Fibonacci number?\")\n```\n\n> [!TIP]\n> Using the agent as a context manager (with the `with` statement) ensures that the Docker container is cleaned up immediately after the agent completes its task.\n> Alternatively, you can manually call the agent's `cleanup()` method.\n\n#### Advanced Docker usage\n\nIf you want to run multi-agent systems in Docker, you'll need to set up a custom interpreter in a sandbox.\n\nHere is how to set up a Dockerfile:\n\n```dockerfile\nFROM python:3.10-bullseye\n\n# Install build 
dependencies\nRUN apt-get update && \\\n    apt-get install -y --no-install-recommends \\\n        build-essential \\\n        python3-dev && \\\n    pip install --no-cache-dir --upgrade pip && \\\n    pip install --no-cache-dir smolagents && \\\n    apt-get clean && \\\n    rm -rf /var/lib/apt/lists/*\n\n# Set working directory\nWORKDIR /app\n\n# Run with limited privileges\nUSER nobody\n\n# Default command\nCMD [\"python\", \"-c\", \"print('Container ready')\"]\n```\n\nCreate a sandbox manager to run code:\n\n```python\nimport docker\nimport os\nfrom typing import Optional\n\nclass DockerSandbox:\n    def __init__(self):\n        self.client = docker.from_env()\n        self.container = None\n\n    def create_container(self):\n        try:\n            image, build_logs = self.client.images.build(\n                path=\".\",\n                tag=\"agent-sandbox\",\n                rm=True,\n                forcerm=True,\n                buildargs={},\n                # decode=True\n            )\n        except docker.errors.BuildError as e:\n            print(\"Build error logs:\")\n            for log in e.build_log:\n                if 'stream' in log:\n                    print(log['stream'].strip())\n            raise\n\n        # Create container with security constraints and proper logging\n        self.container = self.client.containers.run(\n            \"agent-sandbox\",\n            command=\"tail -f /dev/null\",  # Keep container running\n            detach=True,\n            tty=True,\n            mem_limit=\"512m\",\n            cpu_quota=50000,\n            pids_limit=100,\n            security_opt=[\"no-new-privileges\"],\n            cap_drop=[\"ALL\"],\n            environment={\n                \"HF_TOKEN\": os.getenv(\"HF_TOKEN\")\n            },\n        )\n\n    def run_code(self, code: str) -> Optional[str]:\n        if not self.container:\n            self.create_container()\n\n        # Execute code in container\n        exec_result = 
self.container.exec_run(\n            cmd=[\"python\", \"-c\", code],\n            user=\"nobody\"\n        )\n\n        # Collect all output\n        return exec_result.output.decode() if exec_result.output else None\n\n\n    def cleanup(self):\n        if self.container:\n            try:\n                self.container.stop()\n            except docker.errors.NotFound:\n                # Container already removed, this is expected\n                pass\n            except Exception as e:\n                print(f\"Error during cleanup: {e}\")\n            finally:\n                self.container = None  # Clear the reference\n\n# Example usage:\nsandbox = DockerSandbox()\n\ntry:\n    # Define your agent code\n    agent_code = \"\"\"\nimport os\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# Initialize the agent\nagent = CodeAgent(\n    model=InferenceClientModel(token=os.getenv(\"HF_TOKEN\"), provider=\"together\"),\n    tools=[]\n)\n\n# Run the agent\nresponse = agent.run(\"What's the 20th Fibonacci number?\")\nprint(response)\n\"\"\"\n\n    # Run the code in the sandbox\n    output = sandbox.run_code(agent_code)\n    print(output)\n\nfinally:\n    sandbox.cleanup()\n```\n\n### WebAssembly setup\n\nWebAssembly (Wasm) is a binary instruction format that allows code to be run in a safe, sandboxed environment.\nIt is designed to be fast, efficient, and secure, making it an excellent choice for executing potentially untrusted code.\n\nThe `WasmExecutor` uses [Pyodide](https://pyodide.org/) and [Deno](https://docs.deno.com/).\n\n#### Installation\n\n1. 
[Install Deno on your system](https://docs.deno.com/runtime/getting_started/installation/)\n\n#### Running your agent in WebAssembly: quick start\n\nSimply pass `executor_type=\"wasm\"` to the agent initialization, like:\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nagent = CodeAgent(model=InferenceClientModel(), tools=[], executor_type=\"wasm\")\n\nagent.run(\"Can you give me the 100th Fibonacci number?\")\n```\n\n### Best practices for sandboxes\n\nThese key practices apply to Blaxel, E2B, and Docker sandboxes:\n\n- Resource management\n  - Set memory and CPU limits\n  - Implement execution timeouts\n  - Monitor resource usage\n- Security\n  - Run with minimal privileges\n  - Disable unnecessary network access\n  - Use environment variables for secrets\n- Environment\n  - Keep dependencies minimal\n  - Use fixed package versions\n  - If you use base images, update them regularly\n\n- Cleanup\n  - Always ensure proper cleanup of resources, especially for Docker containers, to avoid having dangling containers eating up resources.\n\n✨ By following these practices and implementing proper cleanup procedures, you can ensure your agent runs safely and efficiently in a sandboxed environment.\n\n## Comparing security approaches\n\nAs illustrated in the diagram earlier, both sandboxing approaches have different security implications:\n\n### Approach 1: Running just the code snippets in a sandbox\n- **Pros**: \n  - Easier to set up with a simple parameter (`executor_type=\"blaxel\"`, `executor_type=\"e2b\"`, or `executor_type=\"docker\"`)\n  - No need to transfer API keys to the sandbox\n  - Better protection for your local environment\n  - Fast execution with Blaxel's hibernation technology (<25ms startup)\n- **Cons**:\n  - Doesn't support multi-agents (managed agents)\n  - Still requires transferring state between your environment and the sandbox\n  - Limited to specific code execution\n\n### Approach 2: Running the entire agentic system in a 
sandbox\n- **Pros**:\n  - Supports multi-agents\n  - Complete isolation of the entire agent system\n  - More flexible for complex agent architectures\n- **Cons**:\n  - Requires more manual setup\n  - May require transferring sensitive API keys to the sandbox\n  - Potentially higher latency due to more complex operations\n\nChoose the approach that best balances your security needs with your application's requirements. For most applications with simpler agent architectures, Approach 1 provides a good balance of security and ease of use. For more complex multi-agent systems where you need full isolation, Approach 2, while more involved to set up, offers better security guarantees.\n"
  },
  {
    "path": "docs/source/en/tutorials/tools.md",
    "content": "# Tools\n\n[[open-in-colab]]\n\nHere, we're going to see advanced tool usage.\n\n> [!TIP]\n> If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).\n\n\n### What is a tool, and how to build one?\n\nA tool is mostly a function that an LLM can use in an agentic system.\n\nBut to use it, the LLM will need to be given an API: name, tool description, input types and descriptions, output type.\n\nSo it cannot be only a function. It should be a class.\n\nSo at core, the tool is a class that wraps a function with metadata that helps the LLM understand how to use it.\n\nHere's how it looks:\n\n```python\nfrom smolagents import Tool\n\nclass HFModelDownloadsTool(Tool):\n    name = \"model_download_counter\"\n    description = \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\"\"\"\n    inputs = {\n        \"task\": {\n            \"type\": \"string\",\n            \"description\": \"the task category (such as text-classification, depth-estimation, etc)\",\n        }\n    }\n    output_type = \"string\"\n\n    def forward(self, task: str):\n        from huggingface_hub import list_models\n\n        model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return model.id\n\nmodel_downloads_tool = HFModelDownloadsTool()\n```\n\nThe custom tool subclasses [`Tool`] to inherit useful methods. The child class also defines:\n- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.\n- An attribute `description` is used to populate the agent's system prompt.\n- An `inputs` attribute, which is a dictionary with keys `\"type\"` and `\"description\"`. 
It contains information that helps the Python interpreter make educated choices about the input.\n- An `output_type` attribute, which specifies the output type. The types for both `inputs` and `output_type` should be [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema); they can be any of these: `[\"string\", \"boolean\",\"integer\", \"number\", \"image\", \"audio\", \"array\", \"object\", \"any\", \"null\"]`.\n- A `forward` method which contains the inference code to be executed.\n\nAnd that's all it needs to be used in an agent!\n\nThere's another way to build a tool. In the [guided_tour](../guided_tour), we implemented a tool using the `@tool` decorator. The [`tool`] decorator is the recommended way to define simple tools, but sometimes you need more than this: using several methods in a class for more clarity, or using additional class attributes.\n\nIn this case, you can build your tool by subclassing [`Tool`] as described above.\n\n### Share your tool to the Hub\n\nYou can share your custom tool to the Hub as a Space repository by calling [`~Tool.push_to_hub`] on the tool. Make sure you've created a repository for it on the Hub and are using a token with write access.\n\n```python\nmodel_downloads_tool.push_to_hub(\"{your_username}/hf-model-downloads\", token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n```\n\nFor the push to Hub to work, your tool will need to respect some rules:\n- All methods are self-contained, i.e. they only use variables that come from their args.\n- As per the above point, **all imports should be defined directly within the tool's functions**, else you will get an error when trying to call [`~Tool.save`] or [`~Tool.push_to_hub`] with your custom tool.\n- If you override the `__init__` method, you can give it no other argument than `self`. This is because arguments set during a specific tool instance's initialization are hard to track, which prevents sharing them properly to the Hub. 
In any case, the idea of making a specific class is that you can already set class attributes for anything you need to hard-code (just set `your_variable=(...)` directly under the `class YourTool(Tool):` line). And of course you can still set an attribute anywhere in your code by assigning to `self.your_variable`.\n\n\nOnce your tool is pushed to the Hub, you can visualize it. [Here](https://huggingface.co/spaces/m-ric/hf-model-downloads) is the `model_downloads_tool` that I've pushed. It has a nice Gradio interface.\n\nWhen diving into the tool files, you can find that all the tool's logic is under [tool.py](https://huggingface.co/spaces/m-ric/hf-model-downloads/blob/main/tool.py). That is where you can inspect a tool shared by someone else.\n\nThen you can load the tool with [`load_tool`] or create it with [`~Tool.from_hub`] and pass it to the `tools` parameter in your agent.\nSince running tools means running custom code, you need to make sure you trust the repository, so we require you to pass `trust_remote_code=True` to load a tool from the Hub.\n\n```python\nfrom smolagents import load_tool, CodeAgent\n\nmodel_download_tool = load_tool(\n    \"{your_username}/hf-model-downloads\",\n    trust_remote_code=True\n)\n```\n\n### Use tools from an MCP server\n\nOur `MCPClient` allows you to load tools from an MCP server, and gives you full control over the connection and tool management:\n\nFor stdio-based MCP servers:\n```python\nfrom smolagents import MCPClient, CodeAgent\nfrom mcp import StdioServerParameters\nimport os\n\nserver_parameters = StdioServerParameters(\n    command=\"uvx\",  # Using uvx ensures dependencies are available\n    args=[\"--quiet\", \"pubmedmcp@0.1.3\"],\n    env={\"UV_PYTHON\": \"3.12\", **os.environ},\n)\n\nwith MCPClient(server_parameters) as tools:\n    agent = CodeAgent(tools=tools, model=model, add_base_tools=True)\n    agent.run(\"Please find the latest research on COVID-19 treatment.\")\n```\n\nFor Streamable HTTP-based MCP 
servers:\n```python\nfrom smolagents import MCPClient, CodeAgent\n\nwith MCPClient({\"url\": \"http://127.0.0.1:8000/mcp\", \"transport\": \"streamable-http\"}) as tools:\n    agent = CodeAgent(tools=tools, model=model, add_base_tools=True)\n    agent.run(\"Please find a remedy for hangover.\")\n```\n\nYou can also manually manage the connection lifecycle with the try...finally pattern:\n\n```python\nfrom smolagents import MCPClient, CodeAgent\nfrom mcp import StdioServerParameters\nimport os\n\n# Initialize server parameters\nserver_parameters = StdioServerParameters(\n    command=\"uvx\",\n    args=[\"--quiet\", \"pubmedmcp@0.1.3\"],\n    env={\"UV_PYTHON\": \"3.12\", **os.environ},\n)\n\n# Manually manage the connection\ntry:\n    mcp_client = MCPClient(server_parameters)\n    tools = mcp_client.get_tools()\n\n    # Use the tools with your agent\n    agent = CodeAgent(tools=tools, model=model, add_base_tools=True)\n    result = agent.run(\"What are the recent therapeutic approaches for Alzheimer's disease?\")\n\n    # Process the result as needed\n    print(f\"Agent response: {result}\")\nfinally:\n    # Always ensure the connection is properly closed\n    mcp_client.disconnect()\n```\n\nYou can also connect to multiple MCP servers at once by passing a list of server parameters:\n```python\nfrom smolagents import MCPClient, CodeAgent\nfrom mcp import StdioServerParameters\nimport os\n\nserver_params1 = StdioServerParameters(\n    command=\"uvx\",\n    args=[\"--quiet\", \"pubmedmcp@0.1.3\"],\n    env={\"UV_PYTHON\": \"3.12\", **os.environ},\n)\n\nserver_params2 = {\"url\": \"http://127.0.0.1:8000/sse\"}\n\nwith MCPClient([server_params1, server_params2]) as tools:\n    agent = CodeAgent(tools=tools, model=model, add_base_tools=True)\n    agent.run(\"Please analyze the latest research and suggest remedies for headaches.\")\n```\n\n> [!WARNING]\n> **Security Warning:** Always verify the source and integrity of any MCP server before connecting to it, especially for 
production environments.\n> Using MCP servers comes with security risks:\n> - **Trust is essential:** Only use MCP servers from trusted sources. Malicious servers can execute harmful code on your machine.\n> - **Stdio-based MCP servers** will always execute code on your machine (that's their intended functionality).\n> - **Streamable HTTP-based MCP servers:** While remote MCP servers will not execute code on your machine, still proceed with caution.\n\n#### Structured Output and Output Schema Support\n\nThe latest [MCP specifications (2025-06-18+)](https://modelcontextprotocol.io/specification/2025-06-18/server/tools#structured-content) include support for `outputSchema`, which enables tools to return structured data with defined schemas. `smolagents` takes advantage of these structured output capabilities, allowing agents to work with tools that return complex data structures, JSON objects, and other structured formats. With this feature, the agent's LLMs can \"see\" the structure of the tool output before calling a tool, enabling more intelligent and context-aware interactions.\n\nTo enable structured output support, pass `structured_output=True` when initializing the `MCPClient`:\n\n```python\nfrom smolagents import MCPClient, CodeAgent\n\n# Enable structured output support\nwith MCPClient(server_parameters, structured_output=True) as tools:\n    agent = CodeAgent(tools=tools, model=model, add_base_tools=True)\n    agent.run(\"Get weather information for Paris\")\n```\n\nWhen `structured_output=True`, the following features are enabled:\n- **Output Schema Support**: Tools can define JSON schemas for their outputs\n- **Structured Content Handling**: Support for `structuredContent` in MCP responses\n- **JSON Parsing**: Automatic parsing of structured data from tool responses\n\nHere's an example using a weather MCP server with structured output:\n\n```python\n# demo/weather.py - Example MCP server with structured output\nfrom pydantic import BaseModel, Field\nfrom 
mcp.server.fastmcp import FastMCP\n\nmcp = FastMCP(\"Weather Service\")\n\nclass WeatherInfo(BaseModel):\n    location: str = Field(description=\"The location name\")\n    temperature: float = Field(description=\"Temperature in Celsius\")\n    conditions: str = Field(description=\"Weather conditions\")\n    humidity: int = Field(description=\"Humidity percentage\", ge=0, le=100)\n\n@mcp.tool(\n    name=\"get_weather_info\",\n    description=\"Get weather information for a location as structured data.\",\n    # structured_output=True is enabled by default in FastMCP\n)\ndef get_weather_info(city: str) -> WeatherInfo:\n    \"\"\"Get weather information for a city.\"\"\"\n    return WeatherInfo(\n        location=city,\n        temperature=22.5,\n        conditions=\"partly cloudy\",\n        humidity=65\n    )\n```\n\nAgent using output schema and structured output:\n\n```python\nfrom smolagents import MCPClient, CodeAgent\n\n# Using the weather server with structured output\nfrom mcp import StdioServerParameters\n\nserver_parameters = StdioServerParameters(\n    command=\"python\",\n    args=[\"demo/weather.py\"]\n)\n\nwith MCPClient(server_parameters, structured_output=True) as tools:\n    agent = CodeAgent(tools=tools, model=model)\n    result = agent.run(\"What is the temperature in Tokyo in Fahrenheit?\")\n    print(result)\n```\n\nWhen structured output is enabled, the `CodeAgent` system prompt is enhanced to include JSON schema information for tools, helping the agent understand the expected structure of tool outputs and access the data appropriately.\n\n**Backwards Compatibility**: The `structured_output` parameter currently defaults to `False` to maintain backwards compatibility. Existing code will continue to work without changes, receiving simple text outputs as before.\n\n**Future Change**: In a future release, the default value of `structured_output` will change from `False` to `True`. 
It is recommended to explicitly set `structured_output=True` to opt into the enhanced functionality, which provides better tool output handling and improved agent performance. Use `structured_output=False` only if you specifically need to maintain the current text-only behavior.\n\n### Import a Space as a tool\n\nYou can directly import a Gradio Space from the Hub as a tool using the [`Tool.from_space`] method!\n\nYou only need to provide the id of the Space on the Hub, its name, and a description that will help your agent understand what the tool does. Under the hood, this will use the [`gradio-client`](https://pypi.org/project/gradio-client/) library to call the Space.\n\nFor instance, let's import the [FLUX.1-schnell](https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell) Space from the Hub and use it to generate an image.\n\n```python\nimage_generation_tool = Tool.from_space(\n    \"black-forest-labs/FLUX.1-schnell\",\n    name=\"image_generator\",\n    description=\"Generate an image from a prompt\"\n)\n\nimage_generation_tool(\"A sunny beach\")\n```\nAnd voilà, here's your image! 🏖️\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp\">\n\nThen you can use this tool just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit` and generate an image of it. 
This example also shows how you can pass additional arguments to the agent.\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\nagent = CodeAgent(tools=[image_generation_tool], model=model)\n\nagent.run(\n    \"Improve this prompt, then generate an image of it.\", additional_args={'user_prompt': 'A rabbit wearing a space suit'}\n)\n```\n\n```text\n=== Agent thoughts:\nimproved_prompt could be \"A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background\"\n\nNow that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.\n>>> Agent is executing the code below:\nimage = image_generator(prompt=\"A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background\")\nfinal_answer(image)\n```\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp\">\n\nHow cool is this? 
🤩\n\n### Use LangChain tools\n\nWe love LangChain and think it has a very compelling suite of tools.\nTo import a tool from LangChain, use the `from_langchain()` method.\n\nHere is how you can use it to recreate the intro's search result using a LangChain web search tool.\nThis tool will need `pip install langchain google-search-results -q` to work properly.\n```python\nfrom langchain.agents import load_tools\nfrom smolagents import CodeAgent, Tool\n\nsearch_tool = Tool.from_langchain(load_tools([\"serpapi\"])[0])\n\nagent = CodeAgent(tools=[search_tool], model=model)\n\nagent.run(\"How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?\")\n```\n\n### Manage your agent's toolbox\n\nYou can manage an agent's toolbox by adding or replacing a tool in the `agent.tools` attribute, since it is a standard dictionary.\n\nLet's add the `model_download_tool` to an existing agent initialized with only the default toolbox.\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\nagent.tools[model_download_tool.name] = model_download_tool\n```\nNow we can leverage the new tool:\n\n```python\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub but reverse the letters?\"\n)\n```\n\n\n> [!TIP]\n> Be careful not to add too many tools to an agent: this can overwhelm weaker LLM engines.\n\n\n### Use a collection of tools\n\nYou can leverage tool collections by using [`ToolCollection`]. 
It supports loading either a collection from the Hub or the tools of an MCP server.\n\n\n#### Tool Collection from any MCP server\n\nLeverage tools from the hundreds of MCP servers available on [glama.ai](https://glama.ai/mcp/servers) or [smithery.ai](https://smithery.ai/).\n\nAn MCP server's tools can be loaded with [`ToolCollection.from_mcp`].\n\n> [!WARNING]\n> **Security Warning:** Always verify the source and integrity of any MCP server before connecting to it, especially for production environments.\n> Using MCP servers comes with security risks:\n> - **Trust is essential:** Only use MCP servers from trusted sources. Malicious servers can execute harmful code on your machine.\n> - **Stdio-based MCP servers** will always execute code on your machine (that's their intended functionality).\n> - **Streamable HTTP-based MCP servers:** While remote MCP servers will not execute code on your machine, still proceed with caution.\n\nFor stdio-based MCP servers, pass the server parameters as an instance of `mcp.StdioServerParameters`:\n```py\nimport os\n\nfrom smolagents import ToolCollection, CodeAgent\nfrom mcp import StdioServerParameters\n\nserver_parameters = StdioServerParameters(\n    command=\"uvx\",\n    args=[\"--quiet\", \"pubmedmcp@0.1.3\"],\n    env={\"UV_PYTHON\": \"3.12\", **os.environ},\n)\n\nwith ToolCollection.from_mcp(server_parameters, trust_remote_code=True) as tool_collection:\n    agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)\n    agent.run(\"Please find a remedy for hangover.\")\n```\n\nTo enable structured output support with ToolCollection, add the `structured_output=True` parameter:\n```py\nwith ToolCollection.from_mcp(server_parameters, trust_remote_code=True, structured_output=True) as tool_collection:\n    agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)\n    agent.run(\"Please find a remedy for hangover.\")\n```\n\nFor Streamable HTTP-based MCP servers, simply pass a dict with parameters 
to `mcp.client.streamable_http.streamablehttp_client` and add the key `transport` with the value `\"streamable-http\"`:\n```py\nfrom smolagents import ToolCollection, CodeAgent\n\nwith ToolCollection.from_mcp({\"url\": \"http://127.0.0.1:8000/mcp\", \"transport\": \"streamable-http\"}, trust_remote_code=True) as tool_collection:\n    agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)\n    agent.run(\"Please find a remedy for hangover.\")\n```\n\n#### Tool Collection from a collection in the Hub\n\nTo load a collection from the Hub, pass the slug of the collection you want to use.\nThen pass its tools as a list to initialize your agent, and start using them!\n\n```py\nfrom smolagents import ToolCollection, CodeAgent\n\nimage_tool_collection = ToolCollection.from_hub(\n    collection_slug=\"huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f\",\n    token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\"\n)\nagent = CodeAgent(tools=[*image_tool_collection.tools], model=model, add_base_tools=True)\n\nagent.run(\"Please draw me a picture of rivers and lakes.\")\n```\n\nTo speed up the start, tools are loaded only if called by the agent.\n\n"
  },
  {
    "path": "docs/source/es/_config.py",
    "content": "# docstyle-ignore\nINSTALL_CONTENT = \"\"\"\n# Installation\n! pip install smolagents\n# To install from source instead of the last release, comment the command above and uncomment the following one.\n# ! pip install git+https://github.com/huggingface/smolagents.git\n\"\"\"\n\nnotebook_first_cells = [{\"type\": \"code\", \"content\": INSTALL_CONTENT}]\nblack_avoid_patterns = {\n    \"{processor_class}\": \"FakeProcessorClass\",\n    \"{model_class}\": \"FakeModelClass\",\n    \"{object_class}\": \"FakeObjectClass\",\n}\n"
  },
  {
    "path": "docs/source/es/_toctree.yml",
    "content": "- title: Primeros Pasos\n  sections:\n  - local: index\n    title: Introducción\n  - local: installation\n    title: Opciones de instalación\n#   - local: guided_tour\n#     title: Guided tour\n# - title: Tutorials\n#   sections:\n#   - local: tutorials/building_good_agents\n#     title: ✨ Building good agents\n#   - local: tutorials/inspect_runs\n#     title: 📊 Inspect your agent runs using telemetry\n#   - local: tutorials/tools\n#     title: 🛠️ Tools - in-depth guide\n#   - local: tutorials/secure_code_execution\n#     title: 🛡️ Secure code execution\n#   - local: tutorials/memory\n#     title: 📚 Manage your agent's memory\n# - title: Conceptual guides\n#   sections:\n#   - local: conceptual_guides/intro_agents\n#     title: 🤖 What are agents?\n#   - local: conceptual_guides/react\n#     title: 🤔 How do Multi-step agents work?\n# - title: Examples\n#   sections:\n#   - local: examples/text_to_sql\n#     title: Self-correcting Text-to-SQL\n#   - local: examples/rag\n#     title: Master your knowledge base with agentic RAG\n#   - local: examples/multiagents\n#     title: Orchestrate a multi-agent system\n#   - local: examples/web_browser\n#     title: Build a web browser agent using vision models\n#   - local: examples/using_different_models\n#     title: Using different models\n#   - local: examples/plan_customization\n#     title: \"Human-in-the-Loop: Customize agent plan interactively\"\n#   - local: examples/async_agent\n#     title: Async Applications with Agents\n# - title: Reference\n#   sections:\n#   - local: reference/agents\n#     title: Agent-related objects\n#   - local: reference/models\n#     title: Model-related objects\n#   - title: Tools\n#     sections:\n#     - title: Tool-related objects\n#       local: reference/tools\n#     - title: Built-in Tools\n#       local: reference/default_tools\n"
  },
  {
    "path": "docs/source/es/index.md",
    "content": "# `smolagents`\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/license_to_call.png\" style=\"max-width:700px\"/>\n</div>\n\n## ¿Qué es smolagents?\n\n`smolagents` es una biblioteca de código abierto en Python, diseñada para facilitar al máximo la construcción y ejecución de agentes con solo unas pocas líneas de código.\n\nAlgunos aspectos clave de `smolagents` incluyen:\n\n✨ **Simplicidad**: La lógica de los agentes se implementa en aproximadamente unas mil líneas de código. ¡Lo hemos mantenido simple, sin agregar complejidad innecesaria!\n\n🧑‍💻 **Soporte avanzado para Agentes de Código**: [`CodeAgent`](reference/agents#smolagents.CodeAgent) ejecuta acciones directamente en código (en lugar de que los agentes generen código), lo que permite usar varias herramientas o realizar cálculos de manera flexible. Esto hace posible combinar de manera sencilla funciones anidadas, bucles, condicionales y mucho más. Para garantizar la seguridad, el agente puede [ejecutarse en un entorno aislado](tutorials/secure_code_execution) usando [E2B](https://e2b.dev/) o Docker.\n\n📡 **Integración nativa con agentes de herramientas**: además de los CodeAgent,  [`ToolCallingAgent`](reference/agents#smolagents.ToolCallingAgent) es compatible con el esquema tradicional basado en JSON/texto para casos en los que se prefiera este formato.\n\n🤗 **Integraciones con el Hub**: mediante Gradio Spaces es posible compartir y cargar múltiples agentes junto con herramientas desde o hacia el Hub de manera sencilla.\n\n🌐 **Independencia respecto al modelo**: integra fácilmente grandes modelos de lenguaje (LLM) alojados en el Hub mediante los [proveedores de inferencia](https://huggingface.co/docs/inference-providers/index), APIs externas como OpenAI, Anthropic y muchos otros a través de la integración con LiteLLM. 
Además, es posible ejecutar localmente estos sistemas utilizando Transformers u Ollama. Es sencillo y flexible potenciar un agente con tu LLM preferido.\n\n👁️ **Independencia respecto a la modalidad**: los agentes pueden procesar diferentes tipos de entrada (_inputs_) como texto, visión, video y audio, ampliando considerablemente el rango de aplicaciones posibles. Consulta este [tutorial](https://huggingface.co/docs/smolagents/v1.21.0/en/examples/web_browser) sobre el área de visión.\n\n🛠️ **Independencia respecto a las herramientas**: existe una gran variedad de herramientas en cualquier [Servidor MCP](reference/tools#smolagents.ToolCollection.from_mcp), marcos de orquestación como [LangChain](reference/tools#smolagents.Tool.from_langchain) e incluso existe la posibilidad de usar el [Hub Space](reference/tools#smolagents.Tool.from_space) como herramienta.\n\n💻 **Herramientas de CLI**: incluye utilidades en línea de comandos (smolagent, webagent) para ejecutar agentes rápidamente sin código repetitivo.\n\n## Inicio Rápido\n\n[[open-in-colab]]\n\n¡Comienza a usar smolagents en solo unos minutos! Esta guía te mostrará cómo crear y ejecutar tu primer agente.\n\n### Instalación\n\nInstala smolagents usando pip:\n\n```bash\npip install smolagents[toolkit]  # Incluye herramientas básicas como búsqueda web.\n```\n\n### Crea tu Primer Agente\n\nA continuación se detalla un ejemplo básico para crear y ejecutar un agente:\n\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# Iniciar el modelo (utilizando la API de Hugging Face Inference)\nmodel = InferenceClientModel()  # Utiliza el modelo por defecto\n\n# Crear un agente sin herramientas\nagent = CodeAgent(tools=[], model=model)\n\n# Ejecuta el agente con una tarea específica\nresult = agent.run(\"Calculate the sum of numbers from 1 to 10\")\nprint(result)\n```\n¡Eso es todo! 
El agente usará Python para completar la tarea y entregar el resultado.\n\n### Agregar Herramientas\n\nMejoremos las capacidades de nuestro agente añadiendo algunas herramientas:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel, DuckDuckGoSearchTool\n\nmodel = InferenceClientModel()\nagent = CodeAgent(\n    tools=[DuckDuckGoSearchTool()],\n    model=model,\n)\n\n# ¡Ahora el agente puede buscar información en Internet!\nresult = agent.run(\"What is the current weather in Paris?\")\nprint(result)\n```\n\n### Usar Modelos Diferentes\n\nPuedes usar diferentes modelos con los agentes:\n\n```python\n# Usar un modelo específico de Hugging Face\nmodel = InferenceClientModel(model_id=\"meta-llama/Llama-2-70b-chat-hf\")\n\n# Usar la API de OpenAI/Anthropic (requiere smolagents[litellm])\nfrom smolagents import LiteLLMModel\nmodel = LiteLLMModel(model_id=\"gpt-4\")\n\n# Utilizar modelos locales (requiere smolagents[transformers])\nfrom smolagents import TransformersModel\nmodel = TransformersModel(model_id=\"meta-llama/Llama-2-7b-chat-hf\")\n```\n\n## Próximos Pasos\n\n- Aprende a configurar smolagents con diferentes modelos y herramientas en la [Guía de Instalación](installation).\n- Revisa el [Tutorial Guiado](guided_tour) y aprende a usar funciones más avanzadas.\n- Aprende a construir [herramientas personalizadas](tutorials/tools).\n- Conoce más sobre la [ejecución segura de código](tutorials/secure_code_execution).\n- Explora el desarrollo de [sistemas multiagente](tutorials/building_good_agents).\n\n<div class=\"mt-10\">\n  <div class=\"w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5\">\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./guided_tour\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Tutorial Guiado</div>\n      <p 
class=\"text-gray-700\">Domina los conceptos básicos y aprende a manejar agentes. Empieza aquí si nunca los has utilizado.</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./examples/text_to_sql\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Guías prácticas</div>\n      <p class=\"text-gray-700\">Ejemplos prácticos para guiarte en diferentes proyectos. ¡Desarrolla un agente que genere y valide consultas SQL!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./conceptual_guides/intro_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Guías Conceptuales</div>\n      <p class=\"text-gray-700\">Conceptos avanzados para profundizar en la comprensión de temas clave.</p>\n   </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./tutorials/building_good_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">Tutoriales</div>\n      <p class=\"text-gray-700\">Tutoriales completos que cubren aspectos clave para el desarrollo de agentes.</p>\n    </a>\n  </div>\n</div>\n"
  },
  {
    "path": "docs/source/es/installation.md",
    "content": "# Opciones de instalación\n\nLa biblioteca `smolagents` se puede instalar usando pip. Existen varias formas y opciones disponibles para realizar la instalación.\n\n## Requisitos Previos\n- Python 3.10 o una versión más reciente\n- Gestor de paquetes para Python: [`pip`](https://pip.pypa.io/en/stable/) o [`uv`](https://docs.astral.sh/uv/)\n\n## Entorno Virtual\n\nInstalar `smolagents` en un entorno virtual de Python es altamente recomendable. Los entornos virtuales permiten mantener las dependencias \nde tu proyecto aisladas tanto de otros proyectos como de Python en el sistema, evitando conflictos de versiones y simplificando la administración de paquetes.\n\n<hfoptions id=\"virtual-environment\">\n<hfoption id=\"venv\">\nUsando [`venv`](https://docs.python.org/3/library/venv.html):\n```bash\npython -m venv .venv\nsource .venv/bin/activate\n```\n</hfoption>\n<hfoption id=\"uv\">\n\nUsando [`uv`](https://docs.astral.sh/uv/):\n```bash\nuv venv .venv\nsource .venv/bin/activate\n```\n</hfoption>\n</hfoptions>\n\n## Instalación Básica\n\nPara instalar la biblioteca principal (core) de smolagents, usa:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install smolagents\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install smolagents\n```\n</hfoption>\n</hfoptions>\n\n## Instalación con Complementos\n\nExisten dependencias adicionales (extras) en `smolagents` que puedes instalar conforme a tus necesidades.\nLa instalación de estos extras se realiza con la siguiente sintaxis:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install \"smolagents[extra1,extra2]\"\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install \"smolagents[extra1,extra2]\"\n```\n</hfoption>\n</hfoptions>\n\n### Herramientas\n\nEstos complementos incluyen diversas herramientas e integraciones:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **toolkit**: Instala un paquete estándar de herramientas para tareas 
habituales.\n  ```bash\n  pip install \"smolagents[toolkit]\"\n  ```\n- **mcp**: Incorpora el Protocolo de Contexto de Modelo (MCP) para facilitar la integración de herramientas y servicios externos.\n  ```bash\n  pip install \"smolagents[mcp]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **toolkit**: Instala un paquete estándar de herramientas para tareas habituales.\n  ```bash\n  uv pip install \"smolagents[toolkit]\"\n  ```\n- **mcp**: Incorpora el Protocolo de Contexto de Modelo (MCP) para facilitar la integración de herramientas y servicios externos.\n  ```bash\n  uv pip install \"smolagents[mcp]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Integración de Modelos\n\nLas funcionalidades adicionales facilitan la conexión con diversos modelos y frameworks de inteligencia artificial.\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **openai**: Integración para los modelos de OpenAI a través de API.\n  ```bash\n  pip install \"smolagents[openai]\"\n  ```\n- **transformers**: Permite el uso de modelos Transformers de Hugging Face.\n  ```bash\n  pip install \"smolagents[transformers]\"\n  ```\n- **vllm**: Agrega compatibilidad con vLLM para una inferencia de modelos más eficiente.\n  ```bash\n  pip install \"smolagents[vllm]\"\n  ```\n- **mlx-lm**: Incorpora funcionalidades específicas para MLX-LM.\n  ```bash\n  pip install \"smolagents[mlx-lm]\"\n  ```\n- **litellm**: Habilita el uso de LiteLLM en tareas de inferencia con modelos optimizados.\n  ```bash\n  pip install \"smolagents[litellm]\"\n  ```\n- **bedrock**:  Amplía la compatibilidad con servicios de modelos alojados en AWS Bedrock.\n  ```bash\n  pip install \"smolagents[bedrock]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **openai**: Integración para los modelos de OpenAI a través de API.\n  ```bash\n  uv pip install \"smolagents[openai]\"\n  ```\n- **transformers**: Permite el uso de modelos Transformers de Hugging Face.\n  ```bash\n  uv pip install \"smolagents[transformers]\"\n  ```\n- 
**vllm**: Agrega compatibilidad con vLLM para una inferencia de modelos más eficiente.\n  ```bash\n  uv pip install \"smolagents[vllm]\"\n  ```\n- **mlx-lm**: Incorpora funcionalidades específicas para MLX-LM.\n  ```bash\n  uv pip install \"smolagents[mlx-lm]\"\n  ```\n- **litellm**: Habilita el uso de LiteLLM en tareas de inferencia con modelos optimizados.\n  ```bash\n  uv pip install \"smolagents[litellm]\"\n  ```\n- **bedrock**: Amplía la compatibilidad con servicios de modelos alojados en AWS Bedrock.\n  ```bash\n  uv pip install \"smolagents[bedrock]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Capacidades Multimodales\n\nFunciones adicionales para procesar varios tipos de datos:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **vision**: Despliega funciones avanzadas para el procesamiento de imágenes y visión por computadora.\n  ```bash\n  pip install \"smolagents[vision]\"\n  ```\n- **audio**: Incorpora soporte para tareas de procesamiento de audio.\n  ```bash\n  pip install \"smolagents[audio]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **vision**: Despliega funciones avanzadas para el procesamiento de imágenes y visión por computadora.\n  ```bash\n  uv pip install \"smolagents[vision]\"\n  ```\n- **audio**: Incorpora soporte para tareas de procesamiento de audio.\n  ```bash\n  uv pip install \"smolagents[audio]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Ejecución Remota\n\nExtensiones para ejecutar código a distancia:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **docker**: Funcionalidad para ejecutar scripts en entornos Docker.\n  ```bash\n  pip install \"smolagents[docker]\"\n  ```\n- **e2b**: Facilita la ejecución remota mediante soporte E2B.\n  ```bash\n  pip install \"smolagents[e2b]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **docker**: Funcionalidad para ejecutar scripts en entornos Docker.\n  ```bash\n  uv pip install \"smolagents[docker]\"\n  ```\n- **e2b**: Facilita la ejecución remota mediante soporte E2B.\n  
```bash\n  uv pip install \"smolagents[e2b]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Telemetría e Interfaz de Usuario\n\nMódulos complementarios para telemetría, monitoreo y diseño de interfaz:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **telemetry**: Agrega funcionalidades para actividades de monitoreo y trazabilidad.\n  ```bash\n  pip install \"smolagents[telemetry]\"\n  ```\n- **gradio**: Permite la utilización de componentes interactivos en Gradio UI.\n  ```bash\n  pip install \"smolagents[gradio]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **telemetry**: Agrega funcionalidades para actividades de monitoreo y trazabilidad.\n  ```bash\n  uv pip install \"smolagents[telemetry]\"\n  ```\n- **gradio**: Permite la utilización de componentes interactivos en Gradio UI.\n  ```bash\n  uv pip install \"smolagents[gradio]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### Instalación Completa\n\nPara instalar todos los complementos disponibles, puedes usar:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install \"smolagents[all]\"\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install \"smolagents[all]\"\n```\n</hfoption>\n</hfoptions>\n\n## Verificación de la Instalación\n\nDespués de la instalación, puedes verificar que `smolagents` esté instalado correctamente ejecutando:\n\n```python\nimport smolagents\nprint(smolagents.__version__)\n```\n\n## Próximos Pasos\n\nUna vez que `smolagents` esté instalado correctamente, puedes:\n\n- Aprender los conceptos básicos revisando el [Tutorial Guiado](guided_tour).\n- Explorar los ejemplos prácticos y aplicaciones en las [Guías Prácticas](examples/text_to_sql).\n- Profundizar en los conceptos avanzados mediante las [Guías Conceptuales](conceptual_guides/intro_agents).\n- Revisar los [Tutoriales](tutorials/building_good_agents) para el desarrollo de agentes.\n- Consultar la [Documentación de la API](./reference/index) para obtener información detallada sobre clases y funciones.\n"
  },
  {
    "path": "docs/source/hi/_config.py",
    "content": "# docstyle-ignore\nINSTALL_CONTENT = \"\"\"\n# Installation\n! pip install smolagents\n# To install from source instead of the last release, comment the command above and uncomment the following one.\n# ! pip install git+https://github.com/huggingface/smolagents.git\n\"\"\"\n\nnotebook_first_cells = [{\"type\": \"code\", \"content\": INSTALL_CONTENT}]\nblack_avoid_patterns = {\n    \"{processor_class}\": \"FakeProcessorClass\",\n    \"{model_class}\": \"FakeModelClass\",\n    \"{object_class}\": \"FakeObjectClass\",\n}\n"
  },
  {
    "path": "docs/source/hi/_toctree.yml",
    "content": "- title: Get started\n  sections:\n  - local: index\n    title: 🤗 Agents\n  - local: guided_tour\n    title: गाइडेड टूर\n- title: Tutorials\n  sections:\n  - local: tutorials/building_good_agents\n    title: ✨ अच्छे Agents का निर्माण\n  - local: tutorials/inspect_runs\n    title: 📊 OpenTelemetry के साथ runs का निरीक्षण\n  - local: tutorials/tools\n    title: 🛠️ Tools - in-depth guide\n  - local: tutorials/secure_code_execution\n    title: 🛡️ E2B के साथ अपने कोड एक्जीक्यूशन को सुरक्षित करें\n- title: Conceptual guides\n  sections:\n  - local: conceptual_guides/intro_agents\n    title: 🤖 Agentic सिस्टम का परिचय\n  - local: conceptual_guides/react\n    title: 🤔 मल्टी-स्टेप एजेंट कैसे काम करते हैं?\n- title: Examples\n  sections:\n  - local: examples/text_to_sql\n    title: सेल्फ करेक्टिंग Text-to-SQL\n  - local: examples/rag\n    title: एजेंटिक RAG के साथ अपनी ज्ञान आधारित को मास्टर करें\n  - local: examples/multiagents\n    title: एक बहु-एजेंट प्रणाली का आयोजन करें\n- title: Reference\n  sections:\n  - local: reference/agents\n    title: एजेंट से संबंधित ऑब्जेक्ट्स\n  - local: reference/tools\n    title: टूल्स से संबंधित ऑब्जेक्ट्स\n"
  },
  {
    "path": "docs/source/hi/conceptual_guides/intro_agents.md",
    "content": "# Agents का परिचय\n\n## 🤔 Agents क्या हैं?\n\nAI का उपयोग करने वाली किसी भी कुशल प्रणाली को LLM को वास्तविक दुनिया तक किसी प्रकार की पहुंच प्रदान करने की आवश्यकता होगी: उदाहरण के लिए बाहरी जानकारी प्राप्त करने के लिए एक खोज टूल को कॉल करने की संभावना, या किसी कार्य को हल करने के लिए कुछ प्रोग्राम पर कार्य करने की। दूसरे शब्दों में, LLM में ***agency*** होनी चाहिए। एजेंटिक प्रोग्राम LLM के लिए बाहरी दुनिया का प्रवेश द्वार हैं।\n\n> [!TIP]\n> AI Agents वे **प्रोग्राम हैं जहां LLM आउटपुट वर्कफ़्लो को नियंत्रित करते हैं**।\n\nLLM का उपयोग करने वाली कोई भी प्रणाली LLM आउटपुट को कोड में एकीकृत करेगी। कोड वर्कफ़्लो पर LLM के इनपुट का प्रभाव सिस्टम में LLM की एजेंसी का स्तर है।\n\nध्यान दें कि इस परिभाषा के साथ, \"agent\" एक अलग, 0 या 1 परिभाषा नहीं है: इसके बजाय, \"agency\" एक निरंतर स्पेक्ट्रम पर विकसित होती है, जैसे-जैसे आप अपने वर्कफ़्लो पर LLM को अधिक या कम शक्ति देते हैं।\n\nनीचे दी गई तालिका में देखें कि कैसे एजेंसी विभिन्न प्रणालियों में भिन्न हो सकती है:\n\n| एजेंसी स्तर | विवरण | इसे क्या कहा जाता है | उदाहरण पैटर्न |\n|------------|---------|-------------------|----------------|\n| ☆☆☆ | LLM आउटपुट का प्रोग्राम प्रवाह पर कोई प्रभाव नहीं | सरल प्रोसेसर | `process_llm_output(llm_response)` |\n| ★☆☆ | LLM आउटपुट if/else स्विच निर्धारित करता है | राउटर | `if llm_decision(): path_a() else: path_b()` |\n| ★★☆ | LLM आउटपुट फंक्शन एक्जीक्यूशन निर्धारित करता है | टूल कॉलर | `run_function(llm_chosen_tool, llm_chosen_args)` |\n| ★★★ | LLM आउटपुट पुनरावृत्ति और प्रोग्राम की निरंतरता को नियंत्रित करता है | मल्टी-स्टेप एजेंट | `while llm_should_continue(): execute_next_step()` |\n| ★★★ | एक एजेंटिक वर्कफ़्लो दूसरे एजेंटिक वर्कफ़्लो को शुरू कर सकता है | मल्टी-एजेंट | `if llm_trigger(): execute_agent()` |\n\nमल्टी-स्टेप agent की यह कोड संरचना है:\n\n```python\nmemory = [user_defined_task]\nwhile llm_should_continue(memory): # यह लूप मल्टी-स्टेप भाग है\n    action = llm_get_next_action(memory) # यह टूल-कॉलिंग भाग है\n    observations = execute_action(action)\n    
memory += [action, observations]\n```\n\nयह एजेंटिक सिस्टम एक लूप में चलता है, प्रत्येक चरण में एक नई क्रिया को शुरू करता है (क्रिया में कुछ पूर्व-निर्धारित *tools* को कॉल करना शामिल हो सकता है जो केवल फंक्शंस हैं), जब तक कि उसके अवलोकन से यह स्पष्ट न हो जाए कि दिए गए कार्य को हल करने के लिए एक संतोषजनक स्थिति प्राप्त कर ली गई है।\n\n## ✅ Agents का उपयोग कब करें / ⛔ कब उनसे बचें\n\nAgents तब उपयोगी होते हैं जब आपको किसी ऐप के वर्कफ़्लो को निर्धारित करने के लिए LLM की आवश्यकता होती है। लेकिन वे अक्सर जरूरत से ज्यादा होते हैं। सवाल यह है कि, क्या मुझे वास्तव में दिए गए कार्य को कुशलतापूर्वक हल करने के लिए वर्कफ़्लो में लचीलेपन की आवश्यकता है?\nयदि पूर्व-निर्धारित वर्कफ़्लो बहुत बार विफल होता है, तो इसका मतलब है कि आपको अधिक लचीलेपन की आवश्यकता है।\n\nआइए एक उदाहरण लेते हैं: मान लीजिए आप एक ऐप बना रहे हैं जो एक सर्फिंग ट्रिप वेबसाइट पर ग्राहक अनुरोधों को संभालता है।\n\nआप पहले से जान सकते हैं कि अनुरोध 2 में से किसी एक श्रेणी में आएंगे (उपयोगकर्ता की पसंद के आधार पर), और आपके पास इन 2 मामलों में से प्रत्येक के लिए एक पूर्व-निर्धारित वर्कफ़्लो है।\n\n1. ट्रिप के बारे में कुछ जानकारी चाहिए? ⇒ उन्हें अपने नॉलेज बेस में खोज करने के लिए एक सर्च बार तक पहुंच दें\n2. सेल्स टीम से बात करना चाहते हैं? ⇒ उन्हें एक संपर्क फॉर्म में टाइप करने दें।\n\nयदि वह निर्धारणात्मक वर्कफ़्लो सभी प्रश्नों के लिए फिट बैठता है, तो बेशक बस सब कुछ कोड करें! 
यह आपको एक 100% विश्वसनीय सिस्टम देगा और LLM द्वारा वर्कफ़्लो में अप्रत्याशित हस्तक्षेप से होने वाली त्रुटियों का कोई जोखिम नहीं होगा। सरलता और मजबूती के लिए, सलाह दी जाती है कि एजेंटिक व्यवहार का उपयोग न किया जाए।\n\nलेकिन क्या होगा अगर वर्कफ़्लो को पहले से इतनी अच्छी तरह से निर्धारित नहीं किया जा सकता?\n\nउदाहरण के लिए, एक उपयोगकर्ता पूछना चाहता है: `\"मैं सोमवार को आ सकता हूं, लेकिन मैं अपना पासपोर्ट भूल गया जिससे मुझे बुधवार तक देर हो सकती है, क्या आप मुझे और मेरी चीजों को मंगलवार सुबह सर्फ करने ले जा सकते हैं, क्या मुझे कैंसलेशन इंश्योरेंस मिल सकता है?\"` यह प्रश्न कई कारकों पर निर्भर करता है, और शायद ऊपर दिए गए पूर्व-निर्धारित मानदंडों में से कोई भी इस अनुरोध के लिए पर्याप्त नहीं होगा।\n\nयदि पूर्व-निर्धारित वर्कफ़्लो बहुत बार विफल होता है, तो इसका मतलब है कि आपको अधिक लचीलेपन की आवश्यकता है।\n\nयहीं पर एक एजेंटिक सेटअप मदद करता है।\n\nऊपर दिए गए उदाहरण में, आप बस एक मल्टी-स्टेप agent बना सकते हैं जिसके पास मौसम पूर्वानुमान के लिए एक मौसम API, यात्रा की दूरी जानने के लिए Google Maps API, एक कर्मचारी उपलब्धता डैशबोर्ड और आपके नॉलेज बेस पर एक RAG सिस्टम तक पहुंच है।\n\nहाल तक, कंप्यूटर प्रोग्राम पूर्व-निर्धारित वर्कफ़्लो तक सीमित थे, और if/else स्विच का ढेर लगाकर जटिलता को संभालने का प्रयास करते थे। वे बेहद संकीर्ण कार्यों पर केंद्रित थे, जैसे \"इन संख्याओं का योग निकालें\" या \"इस ग्राफ़ में सबसे छोटा रास्ता खोजें\"। लेकिन वास्तव में, अधिकांश वास्तविक जीवन के कार्य, जैसे ऊपर दिया गया हमारा यात्रा उदाहरण, पूर्व-निर्धारित वर्कफ़्लो में फिट नहीं होते हैं। एजेंटिक सिस्टम प्रोग्राम के लिए वास्तविक दुनिया के कार्यों की विशाल दुनिया खोलते हैं!\n\n## क्यों `smolagents`?\n\nकुछ लो-लेवल एजेंटिक उपयोग के मामलों के लिए, जैसे चेन या राउटर, आप सभी कोड खुद लिख सकते हैं। ऐसा करना आपके लिए कहीं बेहतर रहेगा, क्योंकि यह आपको अपने सिस्टम को बेहतर ढंग से नियंत्रित करने और समझने की अनुमति देगा।\n\nलेकिन जैसे ही आप अधिक जटिल व्यवहारों की ओर बढ़ते हैं जैसे कि LLM को एक फ़ंक्शन कॉल करने देना (यह \"tool calling\" है) या LLM को एक while लूप चलाने देना (\"multi-step agent\"), कुछ 
एब्सट्रैक्शन्स की आवश्यकता होती है:\n- टूल कॉलिंग के लिए, आपको एजेंट के आउटपुट को पार्स करने की आवश्यकता होती है, इसलिए इस आउटपुट को एक पूर्व-निर्धारित प्रारूप की आवश्यकता होती है जैसे \"विचार: मुझे 'get_weather' टूल कॉल करना चाहिए। क्रिया: get_weather(Paris)।\", जिसे आप एक पूर्व-निर्धारित फ़ंक्शन के साथ पार्स करते हैं, और LLM को दिए गए सिस्टम प्रॉम्प्ट को इस प्रारूप के बारे में सूचित करना चाहिए।\n- एक मल्टी-स्टेप एजेंट के लिए जहां LLM आउटपुट लूप को निर्धारित करता है, आपको पिछले लूप इटरेशन में क्या हुआ इसके आधार पर LLM को एक अलग प्रॉम्प्ट देने की आवश्यकता होती है: इसलिए आपको किसी प्रकार की मेमोरी की आवश्यकता होती है।\n\nइन दो उदाहरणों के साथ, हमने पहले ही कुछ चीजों की आवश्यकता का पता लगा लिया:\n\n- बेशक, एक LLM जो सिस्टम को पावर देने वाले इंजन के रूप में कार्य करता है\n- एजेंट द्वारा एक्सेस किए जा सकने वाले टूल्स की एक सूची\n- एक पार्सर जो LLM आउटपुट से टूल कॉल को निकालता है\n- एक सिस्टम प्रॉम्प्ट जो पार्सर के साथ सिंक्रनाइज़ होता है\n- एक मेमोरी\n\nलेकिन रुकिए, चूंकि हम निर्णयों में LLM को जगह देते हैं, निश्चित रूप से वे गलतियां करेंगे: इसलिए हमें एरर लॉगिंग और पुनः प्रयास तंत्र की आवश्यकता है।\n\nये सभी तत्व एक अच्छे कामकाजी सिस्टम बनाने के लिए एक-दूसरे से घनिष्ठ रूप से जुड़े हुए हैं। यही कारण है कि हमने तय किया कि इन सभी चीजों को एक साथ काम करने के लिए बुनियादी निर्माण ब्लॉक्स की आवश्यकता है।\n\n## कोड Agents\n\nएक मल्टी-स्टेप एजेंट में, प्रत्येक चरण पर, LLM बाहरी टूल्स को कुछ कॉल के रूप में एक क्रिया लिख सकता है। इन क्रियाओं को लिखने के लिए एक सामान्य स्वरूप (Anthropic, OpenAI और कई अन्य द्वारा उपयोग किया जाता है) आमतौर पर \"टूल्स के नाम और उपयोग किए जाने वाले तर्कों के JSON के रूप में क्रियाएं लिखने\" के विभिन्न रूप होते हैं, जिन्हें आप फिर यह जानने के लिए पार्स करते हैं कि कौन सा टूल किन तर्कों के साथ निष्पादित करना है।\n\n[कई](https://huggingface.co/papers/2402.01030) [शोध](https://huggingface.co/papers/2411.01747) [पत्रों](https://huggingface.co/papers/2401.00812) ने दिखाया है कि LLM द्वारा टूल कॉल को कोड में लिखना कहीं बेहतर है।\n\nइसका कारण बस यह है कि *हमने 
अपनी कोड भाषाओं को विशेष रूप से कंप्यूटर द्वारा किए गए कार्यों को व्यक्त करने का सर्वोत्तम संभव तरीका बनाने के लिए तैयार किया*। यदि JSON स्निपेट्स बेहतर अभिव्यक्ति होते, तो JSON शीर्ष प्रोग्रामिंग भाषा होती और प्रोग्रामिंग नरक में होती।\n\nनीचे दी गई छवि, [Executable Code Actions Elicit Better LLM Agents](https://huggingface.co/papers/2402.01030) से ली गई है, जो कोड में क्रियाएं लिखने के कुछ फायदे दर्शाती है:\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png\">\n\nJSON जैसे स्निपेट्स की बजाय कोड में क्रियाएं लिखने से बेहतर प्राप्त होता है:\n\n- **कम्पोजेबिलिटी:** क्या आप JSON क्रियाओं को एक-दूसरे के भीतर नेस्ट कर सकते हैं, या बाद में पुन: उपयोग करने के लिए JSON क्रियाओं का एक सेट परिभाषित कर सकते हैं, उसी तरह जैसे आप बस एक पायथन फंक्शन परिभाषित कर सकते हैं?\n- **ऑब्जेक्ट प्रबंधन:** आप `generate_image` जैसी क्रिया के आउटपुट को JSON में कैसे स्टोर करते हैं?\n- **सामान्यता:** कोड को सरल रूप से कुछ भी व्यक्त करने के लिए बनाया गया है जो आप कंप्यूटर से करवा सकते हैं।\n- **LLM प्रशिक्षण डेटा में प्रतिनिधित्व:** बहुत सारी गुणवत्तापूर्ण कोड क्रियाएं पहले से ही LLM के ट्रेनिंग डेटा में शामिल हैं जिसका मतलब है कि वे इसके लिए पहले से ही प्रशिक्षित हैं!"
  },
  {
    "path": "docs/source/hi/conceptual_guides/react.md",
    "content": "# मल्टी-स्टेप एजेंट्स कैसे काम करते हैं?\n\nReAct फ्रेमवर्क ([Yao et al., 2022](https://huggingface.co/papers/2210.03629)) वर्तमान में एजेंट्स बनाने का मुख्य दृष्टिकोण है।\n\nनाम दो शब्दों, \"Reason\" (तर्क) और \"Act\" (क्रिया) के संयोजन पर आधारित है। वास्तव में, इस आर्किटेक्चर का पालन करने वाले एजेंट अपने कार्य को उतने चरणों में हल करेंगे जितने आवश्यक हों, प्रत्येक चरण में एक Reasoning कदम होगा, फिर एक Action कदम होगा, जहाँ यह टूल कॉल्स तैयार करेगा जो उसे कार्य को हल करने के करीब ले जाएंगे।\n\nReAct प्रक्रिया में पिछले चरणों की मेमोरी रखना शामिल है।\n\n> [!TIP]\n> मल्टी-स्टेप एजेंट्स के बारे में अधिक जानने के लिए [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) ब्लॉग पोस्ट पढ़ें।\n\nयहाँ एक वीडियो ओवरव्यू है कि यह कैसे काम करता है:\n\n<div class=\"flex justify-center\">\n    <img\n        class=\"block dark:hidden\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n    <img\n        class=\"hidden dark:block\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n</div>\n\n![ReAct एजेंट का फ्रेमवर्क](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/open-source-llms-as-agents/ReAct.png)\n\nहम दो प्रकार के ToolCallingAgent को लागू करते हैं:\n- [`ToolCallingAgent`] अपने आउटपुट में टूल कॉल को JSON के रूप में जनरेट करता है।\n- [`CodeAgent`] ToolCallingAgent का एक नया प्रकार है जो अपने टूल कॉल को कोड के ब्लॉब्स के रूप में जनरेट करता है, जो उन LLM के लिए वास्तव में अच्छी तरह काम करता है जिनका कोडिंग प्रदर्शन मजबूत है।\n"
  },
  {
    "path": "docs/source/hi/examples/multiagents.md",
    "content": "# मल्टी-एजेंट सिस्टम का आयोजन करें 🤖🤝🤖\n\n[[open-in-colab]]\n\nइस नोटबुक में हम एक **मल्टी-एजेंट वेब ब्राउज़र बनाएंगे: एक एजेंटिक सिस्टम जिसमें कई एजेंट वेब का उपयोग करके समस्याओं को हल करने के लिए सहयोग करते हैं!**\n\nयह एक सरल संरचना होगी, जो प्रबंधित वेब खोज एजेंट को रैप करने के लिए `ManagedAgent` ऑब्जेक्ट का उपयोग करता है:\n\n```\n              +----------------+\n              | Manager agent  |\n              +----------------+\n                       |\n        _______________|______________\n       |                              |\n  Code interpreter   +--------------------------------+\n       tool          |         Managed agent          |\n                     |      +------------------+      |\n                     |      | Web Search agent |      |\n                     |      +------------------+      |\n                     |         |            |         |\n                     |  Web Search tool     |         |\n                     |             Visit webpage tool |\n                     +--------------------------------+\n```\nआइए इस सिस्टम को सेट करें।\n\nआवश्यक डिपेंडेंसी इंस्टॉल करने के लिए नीचे दी गई लाइन चलाएं:\n\n```\n!pip install 'smolagents[toolkit]' --upgrade -q\n```\n\nHF Inference API को कॉल करने के लिए लॉगिन करें:\n\n```\nfrom huggingface_hub import login\n\nlogin()\n```\n\n⚡️ हमारा एजेंट [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking) द्वारा संचालित होगा जो `InferenceClientModel` क्लास का उपयोग करता है जो HF के Inference API का उपयोग करता है: Inference API किसी भी OS मॉडल को जल्दी और आसानी से चलाने की अनुमति देता है।\n\n_नोट:_ The Inference API विभिन्न मानदंडों के आधार पर मॉडल होस्ट करता है, और डिप्लॉय किए गए मॉडल बिना पूर्व सूचना के अपडेट या बदले जा सकते हैं। इसके बारे में अधिक जानें [यहां](https://huggingface.co/docs/api-inference/supported-models)।\n\n```py\nmodel_id = \"Qwen/Qwen3-Next-80B-A3B-Thinking\"\n```\n\n## 🔍 एक वेब सर्च टूल बनाएं\n\nवेब ब्राउज़िंग के लिए, हम 
पहले से मौजूद [`WebSearchTool`] टूल का उपयोग कर सकते हैं जो Google search के समान सुविधा प्रदान करता है।\n\nलेकिन फिर हमें `WebSearchTool` द्वारा खोजे गए पेज को देखने में भी सक्षम होने की आवश्यकता होगी।\nऐसा करने के लिए, हम लाइब्रेरी के बिल्ट-इन `VisitWebpageTool` को इम्पोर्ट कर सकते हैं, लेकिन हम इसे फिर से बनाएंगे यह देखने के लिए कि यह कैसे किया जाता है।\n\nतो आइए `markdownify` का उपयोग करके शुरू से अपना `VisitWebpageTool` टूल बनाएं।\n\n```py\nimport re\nimport requests\nfrom markdownify import markdownify\nfrom requests.exceptions import RequestException\nfrom smolagents import tool\n\n\n@tool\ndef visit_webpage(url: str) -> str:\n    \"\"\"Visits a webpage at the given URL and returns its content as a markdown string.\n\n    Args:\n        url: The URL of the webpage to visit.\n\n    Returns:\n        The content of the webpage converted to Markdown, or an error message if the request fails.\n    \"\"\"\n    try:\n        # Send a GET request to the URL\n        response = requests.get(url)\n        response.raise_for_status()  # Raise an exception for bad status codes\n\n        # Convert the HTML content to Markdown\n        markdown_content = markdownify(response.text).strip()\n\n        # Remove multiple line breaks\n        markdown_content = re.sub(r\"\\n{3,}\", \"\\n\\n\", markdown_content)\n\n        return markdown_content\n\n    except RequestException as e:\n        return f\"Error fetching the webpage: {str(e)}\"\n    except Exception as e:\n        return f\"An unexpected error occurred: {str(e)}\"\n```\n\nठीक है, अब चलिए हमारे टूल को टेस्ट करें!\n\n```py\nprint(visit_webpage(\"https://en.wikipedia.org/wiki/Hugging_Face\")[:500])\n```\n\n## हमारी मल्टी-एजेंट सिस्टम का निर्माण करें 🤖🤝🤖\n\nअब जब हमारे पास सभी टूल्स `search` और `visit_webpage` हैं, हम उनका उपयोग वेब एजेंट बनाने के लिए कर सकते हैं।\n\nइस एजेंट के लिए कौन सा कॉन्फ़िगरेशन चुनें?\n- वेब ब्राउज़िंग एक सिंगल-टाइमलाइन टास्क है जिसे समानांतर टूल कॉल की आवश्यकता नहीं है, इसलिए JSON टूल कॉलिंग 
इसके लिए अच्छी तरह काम करती है। इसलिए हम `ToolCallingAgent` चुनते हैं।\n- साथ ही, चूंकि कभी-कभी वेब सर्च में सही उत्तर खोजने से पहले कई पेजों की सर्च करने की आवश्यकता होती है, हम `max_steps` को बढ़ाकर 10 करना पसंद करते हैं।\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    InferenceClientModel,\n    ManagedAgent,\n    WebSearchTool,\n)\n\nmodel = InferenceClientModel(model_id=model_id)\n\nweb_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), visit_webpage],\n    model=model,\n    max_steps=10,\n)\n```\n\nफिर हम इस एजेंट को एक `ManagedAgent` में रैप करते हैं जो इसे इसके मैनेजर एजेंट द्वारा कॉल करने योग्य बनाएगा।\n\n```py\nmanaged_web_agent = ManagedAgent(\n    agent=web_agent,\n    name=\"search\",\n    description=\"Runs web searches for you. Give it your query as an argument.\",\n)\n```\n\nअंत में हम एक मैनेजर एजेंट बनाते हैं, और इनिशियलाइजेशन पर हम अपने मैनेज्ड एजेंट को इसके `managed_agents` आर्गुमेंट में पास करते हैं।\n\nचूंकि यह एजेंट योजना बनाने और सोचने का काम करता है, उन्नत तर्क लाभदायक होगा, इसलिए `CodeAgent` सबसे अच्छा विकल्प होगा।\n\nसाथ ही, हम एक ऐसा प्रश्न पूछना चाहते हैं जिसमें वर्तमान वर्ष और अतिरिक्त डेटा गणना शामिल है: इसलिए आइए `additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"]` जोड़ें, यदि एजेंट को इन पैकेजों की आवश्यकता हो।\n\n```py\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[managed_web_agent],\n    additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"],\n)\n```\n\nबस इतना ही! अब चलिए हमारे सिस्टम को चलाते हैं! हम एक ऐसा प्रश्न चुनते हैं जिसमें गणना और शोध दोनों की आवश्यकता है।\n\n```py\nanswer = manager_agent.run(\"If LLM training continues to scale up at the current rhythm until 2030, what would be the electric power in GW required to power the biggest training runs by 2030? What would that correspond to, compared to some countries? 
Please provide a source for any numbers used.\")\n```\n\nWe get this report as the answer:\n```\nBased on current growth projections and energy consumption estimates, if LLM trainings continue to scale up at the \ncurrent rhythm until 2030:\n\n1. The electric power required to power the biggest training runs by 2030 would be approximately 303.74 GW, which \ntranslates to about 2,660,762 GWh/year.\n\n2. Comparing this to countries' electricity consumption:\n   - It would be equivalent to about 34% of China's total electricity consumption.\n   - It would exceed the total electricity consumption of India (184%), Russia (267%), and Japan (291%).\n   - It would be nearly 9 times the electricity consumption of countries like Italy or Mexico.\n\n3. Source of numbers:\n   - The initial estimate of 5 GW for future LLM training comes from AWS CEO Matt Garman.\n   - The growth projection used a CAGR of 79.80% from market research by Springs.\n   - Country electricity consumption data is from the U.S. Energy Information Administration, primarily for the year \n2021.\n```\n\nलगता है कि यदि [स्केलिंग हाइपोथिसिस](https://gwern.net/scaling-hypothesis) सत्य बनी रहती है तो हमें कुछ बड़े पावरप्लांट्स की आवश्यकता होगी।\n\nहमारे एजेंट्स ने कार्य को हल करने के लिए कुशलतापूर्वक सहयोग किया! ✅\n\n💡 आप इस ऑर्केस्ट्रेशन को आसानी से अधिक एजेंट्स में विस्तारित कर सकते हैं: एक कोड एक्जीक्यूशन करता है, एक वेब सर्च करता है, एक फाइल लोडिंग को संभालता है।\n"
  },
  {
    "path": "docs/source/hi/examples/rag.md",
    "content": "# एजेंटिक RAG\n\n[[open-in-colab]]\n\nरिट्रीवल-ऑगमेंटेड-जनरेशन (RAG) है \"एक यूजर के प्रश्न का उत्तर देने के लिए LLM का उपयोग करना, लेकिन उत्तर को एक नॉलेज बेस से प्राप्त जानकारी पर आधारित करना\"। इसमें वैनिला या फाइन-ट्यून्ड LLM का उपयोग करने की तुलना में कई फायदे हैं: कुछ नाम लेने के लिए, यह उत्तर को सत्य तथ्यों पर आधारित करने और काल्पनिक बातों को कम करने की अनुमति देता है, यह LLM को डोमेन-विशिष्ट ज्ञान प्रदान करने की अनुमति देता है, और यह नॉलेज बेस से जानकारी तक पहुंच का सूक्ष्म नियंत्रण प्रदान करता है।\n\nलेकिन वैनिला RAG की सीमाएं हैं, सबसे महत्वपूर्ण ये दो:\n- यह केवल एक रिट्रीवल स्टेप करता है: यदि परिणाम खराब हैं, तो जनरेशन भी बदले में खराब होगा।\n- सिमेंटिक समानता की गणना यूजर के प्रश्न को संदर्भ के रूप में लेकर की जाती है, जो अनुकूल नहीं हो सकती: उदाहरण के लिए, यूजर का प्रश्न अक्सर एक सवाल होगा, जबकि सही उत्तर देने वाला डॉक्यूमेंट सकारात्मक स्वर में हो सकता है, और इसका समानता स्कोर अन्य स्रोत दस्तावेज़ों की तुलना में कम हो सकता है, जो प्रश्नवाचक स्वर में हो सकते हैं। इससे संबंधित जानकारी के छूट जाने का जोखिम होता है।\n\nहम एक RAG एजेंट बनाकर इन समस्याओं को कम कर सकते हैं: बहुत सरल तरीके से, एक रिट्रीवर टूल से लैस एजेंट!\n\nयह एजेंट: ✅ स्वयं क्वेरी तैयार करेगा और ✅ आवश्यकता पड़ने पर पुनः-प्राप्ति के लिए समीक्षा करेगा।\n\nइसलिए इसे सहज रूप से कुछ उन्नत RAG तकनीकों को अपना लेना चाहिए!\n- सिमेंटिक खोज में सीधे यूजर क्वेरी का संदर्भ के रूप में उपयोग करने के बजाय, एजेंट स्वयं एक संदर्भ वाक्य तैयार करता है जो लक्षित डॉक्यूमेंट्स के करीब हो सकता है, जैसा कि [HyDE](https://huggingface.co/papers/2212.10496) में किया गया है।\n- एजेंट जनरेट किए गए स्निपेट्स का उपयोग कर सकता है और आवश्यकता पड़ने पर पुनः-प्राप्ति कर सकता है, जैसा कि [Self-Query](https://docs.llamaindex.ai/en/stable/examples/evaluation/RetryQuery/) में किया गया है।\n\nचलिए इस सिस्टम को बनाते हैं। 🛠️\n\nआवश्यक डिपेंडेंसी इंस्टॉल करने के लिए नीचे दी गई लाइन चलाएं।\n```bash\n!pip install smolagents pandas langchain langchain-community sentence-transformers rank_bm25 --upgrade 
-q\n```\nHF Inference API को कॉल करने के लिए, आपको अपने एनवायरनमेंट वेरिएबल `HF_TOKEN` के रूप में एक वैध टोकन की आवश्यकता होगी।\nहम इसे लोड करने के लिए python-dotenv का उपयोग करते हैं।\n```py\nfrom dotenv import load_dotenv\nload_dotenv()\n```\n\nहम पहले एक नॉलेज बेस लोड करते हैं जिस पर हम RAG को लागू करना चाहते हैं: यह डेटा सेट Hugging Face के कई लाइब्रेरी के डॉक्यूमेंट पृष्ठों का संकलन है, जिन्हें Markdown में स्टोर किया गया है। हम केवल `transformers` लाइब्रेरी के दस्तावेज़ों को रखेंगे।\n\nफिर डेटासेट को प्रोसेस करके और इसे एक वेक्टर डेटाबेस में स्टोर करके नॉलेज बेस तैयार करें जिसे रिट्रीवर द्वारा उपयोग किया जाएगा।\n\nहम [LangChain](https://python.langchain.com/docs/introduction/) का उपयोग करते हैं क्योंकि इसमें उत्कृष्ट वेक्टर डेटाबेस उपयोगिताएं हैं।\n\n```py\nimport datasets\nfrom langchain.docstore.document import Document\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain_community.retrievers import BM25Retriever\n\nknowledge_base = datasets.load_dataset(\"m-ric/huggingface_doc\", split=\"train\")\nknowledge_base = knowledge_base.filter(lambda row: row[\"source\"].startswith(\"huggingface/transformers\"))\n\nsource_docs = [\n    Document(page_content=doc[\"text\"], metadata={\"source\": doc[\"source\"].split(\"/\")[1]})\n    for doc in knowledge_base\n]\n\ntext_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=500,\n    chunk_overlap=50,\n    add_start_index=True,\n    strip_whitespace=True,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],\n)\ndocs_processed = text_splitter.split_documents(source_docs)\n```\n\nअब डॉक्यूमेंट्स तैयार हैं।\n\nतो चलिए अपना एजेंटिक RAG सिस्टम बनाएं!\n\n👉 हमें केवल एक RetrieverTool की आवश्यकता है जिसका उपयोग हमारा एजेंट नॉलेज बेस से जानकारी प्राप्त करने के लिए कर सकता है।\n\nचूंकि हमें टूल के एट्रीब्यूट के रूप में एक vectordb जोड़ने की आवश्यकता है, हम सरल टूल कंस्ट्रक्टर को `@tool` डेकोरेटर के साथ सीधे उपयोग नहीं कर सकते: इसलिए हम [tools tutorial](../tutorials/tools) में हाइलाइट 
किए गए सेटअप का पालन करेंगे।\n\n```py\nfrom smolagents import Tool\n\nclass RetrieverTool(Tool):\n    name = \"retriever\"\n    description = \"Uses semantic search to retrieve the parts of transformers documentation that could be most relevant to answer your query.\"\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, docs, **kwargs):\n        super().__init__(**kwargs)\n        self.retriever = BM25Retriever.from_documents(\n            docs, k=10\n        )\n\n    def forward(self, query: str) -> str:\n        assert isinstance(query, str), \"Your search query must be a string\"\n\n        docs = self.retriever.invoke(\n            query,\n        )\n        return \"\\nRetrieved documents:\\n\" + \"\".join(\n            [\n                f\"\\n\\n===== Document {str(i)} =====\\n\" + doc.page_content\n                for i, doc in enumerate(docs)\n            ]\n        )\n\nretriever_tool = RetrieverTool(docs_processed)\n```\nहमने BM25 का उपयोग किया है, जो एक क्लासिक रिट्रीवल विधि है,  क्योंकि इसे सेटअप करना बहुत आसान है।\nरिट्रीवल सटीकता में सुधार करने के लिए, आप BM25 को डॉक्यूमेंट्स के लिए वेक्टर प्रतिनिधित्व का उपयोग करके सिमेंटिक खोज से बदल सकते हैं: इस प्रकार आप एक अच्छा एम्बेडिंग मॉडल चुनने के लिए [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) पर जा सकते हैं।\n\nअब यह सीधा है कि एक एजेंट बनाया जाए जो इस `retriever_tool` का उपयोग करेगा!\n\n\nएजेंट को इनिशियलाइजेशन पर इन आर्गुमेंट्स की आवश्यकता होगी:\n- `tools`: टूल्स की एक सूची जिन्हें एजेंट कॉल कर सकेगा।\n- `model`: LLM जो एजेंट को पावर देता है।\nहमारा `model` एक कॉलेबल होना चाहिए जो इनपुट के रूप में संदेशों की एक सूची लेता है और टेक्स्ट लौटाता है। इसे एक stop_sequences आर्गुमेंट भी स्वीकार करने की आवश्यकता है जो बताता है कि 
जनरेशन कब रोकनी है। सुविधा के लिए, हम सीधे पैकेज में प्रदान की गई `InferenceClientModel` क्लास का उपयोग करते हैं ताकि एक LLM इंजन मिल सके जो Hugging Face के Inference API को कॉल करता है।\n\nऔर हम [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) का उपयोग LLM इंजन के रूप में करते हैं क्योंकि:\n- इसमें लंबा 128k कॉन्टेक्स्ट है, जो लंबे स्रोत दस्तावेजों को प्रोसेस करने में मददगार है।\n- यह हर समय HF के Inference API पर मुफ्त में उपलब्ध है!\n\n_नोट:_ Inference API विभिन्न मानदंडों के आधार पर मॉडल होस्ट करता है, और डिप्लॉय किए गए मॉडल बिना पूर्व सूचना के अपडेट या बदले जा सकते हैं। इसके बारे में अधिक [यहां](https://huggingface.co/docs/api-inference/supported-models) पढ़ें।\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nagent = CodeAgent(\n    tools=[retriever_tool], model=InferenceClientModel(model_id=\"meta-llama/Llama-3.3-70B-Instruct\"), max_steps=4, verbosity_level=2\n)\n```\n\nCodeAgent को इनिशियलाइज करने पर, इसे स्वचालित रूप से एक डिफ़ॉल्ट सिस्टम प्रॉम्प्ट दिया जाता है जो LLM इंजन को चरण-दर-चरण प्रोसेस करने और कोड स्निपेट्स के रूप में टूल कॉल जनरेट करने के लिए कहता है, लेकिन आप आवश्यकतानुसार इस प्रॉम्प्ट टेम्पलेट को अपने टेम्पलेट से बदल सकते हैं।\n\nजब CodeAgent का `.run()` मेथड लॉन्च किया जाता है, तो एजेंट एक लूप में LLM इंजन को कॉल करता है और टूल कॉल्स को निष्पादित करता है, जब तक कि `final_answer` टूल को अंतिम उत्तर के साथ कॉल नहीं किया जाता।\n\n```py\nagent_output = agent.run(\"For a transformers model training, which is slower, the forward or the backward pass?\")\n\nprint(\"Final output:\")\nprint(agent_output)\n```\n\n\n"
  },
  {
    "path": "docs/source/hi/examples/text_to_sql.md",
    "content": "# Text-to-SQL\n\n[[open-in-colab]]\n\nइस ट्यूटोरियल में, हम देखेंगे कि `smolagents` का उपयोग करके SQL का उपयोग करने वाला एजेंट कैसे बनाया जा सकता है।\n\n> आइए सबसे महत्वपूर्ण प्रश्न से शुरू करें: इसे सरल रखकर एक सामान्य text-to-SQL पाइपलाइन का उपयोग क्यों न करें?\n\nएक सामान्य text-to-SQL पाइपलाइन कमजोर होती है, क्योंकि उत्पन्न SQL क्वेरी गलत हो सकती है। इससे भी बुरी बात यह है कि क्वेरी गलत हो सकती है, लेकिन कोई एरर नहीं दिखाएगी, बल्कि बिना किसी अलार्म के गलत/बेकार आउटपुट दे सकती है।\n\n\n👉 इसके बजाय, एक एजेंट सिस्टम आउटपुट का आलोचनात्मक रूप से निरीक्षण कर सकता है और तय कर सकता है कि क्वेरी को बदलने की जरूरत है या नहीं, जिससे इसका प्रदर्शन काफी बेहतर हो जाता है।\n\nआइए इस एजेंट को बनाएं! 💪\n\nपहले, हम SQL एनवायरनमेंट सेटअप करते हैं:\n```py\nfrom sqlalchemy import (\n    create_engine,\n    MetaData,\n    Table,\n    Column,\n    String,\n    Integer,\n    Float,\n    insert,\n    inspect,\n    text,\n)\n\nengine = create_engine(\"sqlite:///:memory:\")\nmetadata_obj = MetaData()\n\n# create the receipts SQL table\ntable_name = \"receipts\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"customer_name\", String(16), primary_key=True),\n    Column(\"price\", Float),\n    Column(\"tip\", Float),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"customer_name\": \"Alan Payne\", \"price\": 12.06, \"tip\": 1.20},\n    {\"receipt_id\": 2, \"customer_name\": \"Alex Mason\", \"price\": 23.86, \"tip\": 0.24},\n    {\"receipt_id\": 3, \"customer_name\": \"Woodrow Wilson\", \"price\": 53.43, \"tip\": 5.43},\n    {\"receipt_id\": 4, \"customer_name\": \"Margaret James\", \"price\": 21.11, \"tip\": 1.00},\n]\nfor row in rows:\n    stmt = insert(receipts).values(**row)\n    with engine.begin() as connection:\n        connection.execute(stmt)\n```\n\n### Agent बनाएं\n\nअब आइए हमारी SQL टेबल को एक टूल द्वारा पुनर्प्राप्त करने योग्य बनाएं। \n\nटूल 
का विवरण विशेषता एजेंट सिस्टम द्वारा LLM के prompt में एम्बेड किया जाएगा: यह LLM को टूल का उपयोग करने के बारे में जानकारी देता है। यहीं पर हम SQL टेबल का वर्णन करना चाहते हैं।\n\n```py\ninspector = inspect(engine)\ncolumns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(\"receipts\")]\n\ntable_description = \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\nprint(table_description)\n```\n\n```text\nColumns:\n  - receipt_id: INTEGER\n  - customer_name: VARCHAR(16)\n  - price: FLOAT\n  - tip: FLOAT\n```\n\nअब आइए हमारा टूल बनाएं। इसे निम्नलिखित की आवश्यकता है: (अधिक जानकारी के लिए [टूल doc](../tutorials/tools) पढ़ें)\n- एक डॉकस्ट्रिंग जिसमें आर्ग्युमेंट्स की सूची वाला `Args:` भाग हो।\n- इनपुट और आउटपुट दोनों पर टाइप हिंट्स।\n\n```py\nfrom smolagents import tool\n\n@tool\ndef sql_engine(query: str) -> str:\n    \"\"\"\n    Allows you to perform SQL queries on the table. Returns a string representation of the result.\n    The table is named 'receipts'. Its description is as follows:\n        Columns:\n        - receipt_id: INTEGER\n        - customer_name: VARCHAR(16)\n        - price: FLOAT\n        - tip: FLOAT\n\n    Args:\n        query: The query to perform. 
This should be correct SQL.\n    \"\"\"\n    output = \"\"\n    with engine.connect() as con:\n        rows = con.execute(text(query))\n        for row in rows:\n            output += \"\\n\" + str(row)\n    return output\n```\n\nअब आइए एक एजेंट बनाएं जो इस टूल का लाभ उठाता है।\n\nहम `CodeAgent` का उपयोग करते हैं, जो smolagents का मुख्य एजेंट क्लास है: एक एजेंट जो कोड में एक्शन लिखता है और ReAct फ्रेमवर्क के अनुसार पिछले आउटपुट पर पुनरावृत्ति कर सकता है।\n\nमॉडल वह LLM है जो एजेंट सिस्टम को संचालित करता है। `InferenceClientModel` आपको HF के Inference API का उपयोग करके LLM को कॉल करने की अनुमति देता है, या तो सर्वरलेस या डेडिकेटेड एंडपॉइंट के माध्यम से, लेकिन आप किसी भी प्रोप्राइटरी API का भी उपयोग कर सकते हैं।\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"meta-llama/Meta-Llama-3.1-8B-Instruct\"),\n)\nagent.run(\"Can you give me the name of the client who got the most expensive receipt?\")\n```\n\n### लेवल 2: टेबल जॉइन्स\n\nअब आइए इसे और चुनौतीपूर्ण बनाएं! 
हम चाहते हैं कि हमारा एजेंट कई टेबल्स के बीच जॉइन को संभाल सके। \n\nतो आइए हम प्रत्येक `receipt_id` के लिए वेटर्स के नाम रिकॉर्ड करने वाली एक दूसरी टेबल बनाते हैं!\n\n```py\ntable_name = \"waiters\"\nwaiters = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"waiter_name\", String(16), primary_key=True),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"waiter_name\": \"Corey Johnson\"},\n    {\"receipt_id\": 2, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 3, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 4, \"waiter_name\": \"Margaret James\"},\n]\nfor row in rows:\n    stmt = insert(waiters).values(**row)\n    with engine.begin() as connection:\n        connection.execute(stmt)\n```\nचूंकि हमने एक नई टेबल जोड़ी है, हम `sql_engine` टूल के विवरण को इस टेबल की जानकारी के साथ अपडेट करते हैं ताकि LLM इसका उचित उपयोग कर सके।\n\n```py\nupdated_description = \"\"\"Allows you to perform SQL queries on the table. 
Beware that this tool's output is a string representation of the execution output.\nIt can use the following tables:\"\"\"\n\ninspector = inspect(engine)\nfor table in [\"receipts\", \"waiters\"]:\n    columns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(table)]\n\n    table_description = f\"Table '{table}':\\n\"\n\n    table_description += \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\n    updated_description += \"\\n\\n\" + table_description\n\nprint(updated_description)\n```\nचूंकि यह रिक्वेस्ट पिछले वाले से थोड़ी कठिन है, हम LLM इंजन को अधिक शक्तिशाली [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking) का उपयोग करने के लिए स्विच करेंगे!\n\n```py\nsql_engine.description = updated_description\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n)\n\nagent.run(\"Which waiter got more total money from tips?\")\n```\nयह सीधे काम करता है! सेटअप आश्चर्यजनक रूप से सरल था, है ना?\n\nयह उदाहरण पूरा हो गया! हमने इन अवधारणाओं को छुआ है:\n- नए टूल्स का निर्माण।\n- टूल के विवरण को अपडेट करना।\n- एक मजबूत LLM में स्विच करने से एजेंट की तर्कशक्ति में मदद मिलती है।\n\n✅ अब आप वह text-to-SQL सिस्टम बना सकते हैं जिसका आपने हमेशा सपना देखा है! ✨"
  },
  {
    "path": "docs/source/hi/guided_tour.md",
    "content": "# Agents - गाइडेड टूर\n\n[[open-in-colab]]\n\nइस गाइडेड टूर में, आप सीखेंगे कि एक एजेंट कैसे बनाएं, इसे कैसे चलाएं, और अपने यूज-केस के लिए बेहतर काम करने के लिए इसे कैसे कस्टमाइज़ करें।\n\n### अपना Agent बनाना\n\nएक मिनिमल एजेंट को इनिशियलाइज़ करने के लिए, आपको कम से कम इन दो आर्ग्यूमेंट्स की आवश्यकता है:\n\n- `model`, आपके एजेंट को पावर देने के लिए एक टेक्स्ट-जनरेशन मॉडल - क्योंकि एजेंट एक सिंपल LLM से अलग है, यह एक सिस्टम है जो LLM को अपने इंजन के रूप में उपयोग करता है। आप इनमें से कोई भी विकल्प उपयोग कर सकते हैं:\n    - [`TransformersModel`] `transformers` पाइपलाइन को पहले से इनिशियलाइज़ करता है जो `transformers` का उपयोग करके आपकी लोकल मशीन पर इन्फरेंस चलाने के लिए होता है।\n    - [`InferenceClientModel`] अंदर से `huggingface_hub.InferenceClient` का लाभ उठाता है।\n    - [`LiteLLMModel`] आपको [LiteLLM](https://docs.litellm.ai/) के माध्यम से 100+ अलग-अलग मॉडल्स को कॉल करने देता है!\n\n- `tools`, `Tools` की एक लिस्ट जिसे एजेंट टास्क को हल करने के लिए उपयोग कर सकता है। यह एक खाली लिस्ट हो सकती है। आप ऑप्शनल आर्ग्यूमेंट `add_base_tools=True` को परिभाषित करके अपनी `tools` लिस्ट के ऊपर डिफ़ॉल्ट टूलबॉक्स भी जोड़ सकते हैं।\n\nएक बार जब आपके पास ये दो आर्ग्यूमेंट्स, `tools` और `model` हैं, तो आप एक एजेंट बना सकते हैं और इसे चला सकते हैं। आप कोई भी LLM उपयोग कर सकते हैं, या तो [Hugging Face API](https://huggingface.co/docs/api-inference/en/index), [transformers](https://github.com/huggingface/transformers/), [ollama](https://ollama.com/), या [LiteLLM](https://www.litellm.ai/) के माध्यम से।\n\n<hfoptions id=\"एक LLM चुनें\">\n<hfoption id=\"Hugging Face API\">\n\nHugging Face API टोकन के बिना उपयोग करने के लिए मुफ्त है, लेकिन फिर इसमें रेट लिमिटेशन होगी।\n\nगेटेड मॉडल्स तक पहुंचने या PRO अकाउंट के साथ अपनी रेट लिमिट्स बढ़ाने के लिए, आपको एनवायरनमेंट वेरिएबल `HF_TOKEN` सेट करना होगा या `InferenceClientModel` के इनिशियलाइजेशन पर `token` वेरिएबल पास करना होगा।\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = 
\"meta-llama/Llama-3.3-70B-Instruct\"\n\nmodel = InferenceClientModel(model_id=model_id, token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Local Transformers Model\">\n\n```python\nfrom smolagents import CodeAgent, TransformersModel\n\nmodel_id = \"meta-llama/Llama-3.2-3B-Instruct\"\n\nmodel = TransformersModel(model_id=model_id)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"OpenAI या Anthropic API\">\n\n`LiteLLMModel` का उपयोग करने के लिए, आपको एनवायरनमेंट वेरिएबल `ANTHROPIC_API_KEY` या `OPENAI_API_KEY` सेट करना होगा, या इनिशियलाइजेशन पर `api_key` वेरिएबल पास करना होगा।\n\n```python\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", api_key=\"YOUR_ANTHROPIC_API_KEY\") # Could use 'gpt-4o'\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Ollama\">\n\n```python\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(\n    model_id=\"ollama_chat/llama3.2\", # This model is a bit weak for agentic behaviours though\n    api_base=\"http://localhost:11434\", # replace with 127.0.0.1:11434 or remote open-ai compatible server if necessary\n    api_key=\"YOUR_API_KEY\", # replace with API key if necessary\n    num_ctx=8192, # ollama default is 2048 which will fail horribly. 8192 works for easy tasks, more is better. 
Check https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator to calculate how much VRAM this will need for the selected model.\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n</hfoptions>\n\n#### CodeAgent और ToolCallingAgent\n\n[`CodeAgent`] हमारा डिफ़ॉल्ट एजेंट है। यह हर स्टेप पर पायथन कोड स्निपेट्स लिखेगा और एक्जीक्यूट करेगा।\n\nडिफ़ॉल्ट रूप से, एक्जीक्यूशन आपके लोकल एनवायरनमेंट में किया जाता है।\nयह सुरक्षित होना चाहिए क्योंकि केवल वही फ़ंक्शंस कॉल किए जा सकते हैं जो आपने प्रदान किए हैं (विशेष रूप से यदि यह केवल Hugging Face टूल्स हैं) और पूर्व-परिभाषित सुरक्षित फ़ंक्शंस जैसे `print` या `math` मॉड्यूल से फ़ंक्शंस, इसलिए आप पहले से ही सीमित हैं कि क्या एक्जीक्यूट किया जा सकता है।\n\nपायथन इंटरप्रेटर डिफ़ॉल्ट रूप से सेफ लिस्ट के बाहर इम्पोर्ट की अनुमति नहीं देता है, इसलिए सबसे स्पष्ट अटैक समस्या नहीं होनी चाहिए।\nआप अपने [`CodeAgent`] के इनिशियलाइजेशन पर आर्ग्यूमेंट `additional_authorized_imports` में स्ट्रिंग्स की लिस्ट के रूप में अतिरिक्त मॉड्यूल्स को अधिकृत कर सकते हैं।\n\n```py\nmodel = InferenceClientModel()\nagent = CodeAgent(tools=[], model=model, additional_authorized_imports=['requests', 'bs4'])\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n> [!WARNING]\n> LLM आर्बिट्ररी कोड जनरेट कर सकता है जो फिर एक्जीक्यूट किया जाएगा: कोई असुरक्षित इम्पोर्ट न जोड़ें!\n\nएक्जीक्यूशन किसी भी कोड पर रुक जाएगा जो एक अवैध ऑपरेशन करने का प्रयास करता है या यदि एजेंट द्वारा जनरेट किए गए कोड में एक रेगुलर पायथन एरर है।\n\nआप [E2B कोड एक्जीक्यूटर](https://e2b.dev/docs#what-is-e2-b) या Docker का उपयोग लोकल पायथन इंटरप्रेटर के बजाय कर सकते हैं। E2B के लिए, पहले [`E2B_API_KEY` एनवायरनमेंट वेरिएबल सेट करें](https://e2b.dev/dashboard?tab=keys) और फिर एजेंट इनिशियलाइजेशन पर `executor_type=\"e2b\"` पास करें। Docker के लिए, इनिशियलाइजेशन के दौरान `executor_type=\"docker\"` पास करें।\n\n> 
[!TIP]\n> कोड एक्जीक्यूशन के बारे में और जानें [इस ट्यूटोरियल में](tutorials/secure_code_execution)।\n\nहम JSON-जैसे ब्लॉब्स के रूप में एक्शन लिखने के व्यापक रूप से उपयोग किए जाने वाले तरीके का भी समर्थन करते हैं: यह [`ToolCallingAgent`] है, यह बहुत कुछ [`CodeAgent`] की तरह ही काम करता है, बेशक `additional_authorized_imports` के बिना क्योंकि यह कोड एक्जीक्यूट नहीं करता।\n\n```py\nfrom smolagents import ToolCallingAgent, WebSearchTool\n\nagent = ToolCallingAgent(tools=[WebSearchTool()], model=model)\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n### एजेंट रन का निरीक्षण\n\nरन के बाद क्या हुआ यह जांचने के लिए यहाँ कुछ उपयोगी एट्रिब्यूट्स हैं:\n- `agent.logs` एजेंट के फाइन-ग्रेन्ड लॉग्स को स्टोर करता है। एजेंट के रन के हर स्टेप पर, सब कुछ एक डिक्शनरी में स्टोर किया जाता है जो फिर `agent.logs` में जोड़ा जाता है।\n- `agent.write_memory_to_messages()` चलाने से LLM के लिए एजेंट के लॉग्स की एक इनर मेमोरी बनती है, चैट मैसेज की लिस्ट के रूप में। यह मेथड लॉग के प्रत्येक स्टेप पर जाता है और केवल वही स्टोर करता है जिसमें यह एक मैसेज के रूप में रुचि रखता है: उदाहरण के लिए, यह सिस्टम प्रॉम्प्ट और टास्क को अलग-अलग मैसेज के रूप में सेव करेगा, फिर प्रत्येक स्टेप के लिए यह LLM आउटपुट को एक मैसेज के रूप में और टूल कॉल आउटपुट को दूसरे मैसेज के रूप में स्टोर करेगा।\n\n## टूल्स\n\nटूल एक एटॉमिक फ़ंक्शन है जिसे एजेंट द्वारा उपयोग किया जाता है। LLM द्वारा उपयोग किए जाने के लिए, इसे कुछ एट्रिब्यूट्स की भी आवश्यकता होती है जो इसकी API बनाते हैं और LLM को यह बताने के लिए उपयोग किए जाएंगे कि इस टूल को कैसे कॉल करें:\n- एक नाम\n- एक विवरण\n- इनपुट प्रकार और विवरण\n- एक आउटपुट प्रकार\n\nआप उदाहरण के लिए [`PythonInterpreterTool`] को चेक कर सकते हैं: इसमें एक नाम, विवरण, इनपुट विवरण, एक आउटपुट प्रकार, और एक्शन करने के लिए एक `forward` मेथड है।\n\nजब एजेंट इनिशियलाइज़ किया जाता है, टूल एट्रिब्यूट्स का उपयोग एक टूल विवरण जनरेट करने के लिए किया जाता है जो एजेंट के सिस्टम प्रॉम्प्ट में बेक किया जाता है। यह एजेंट को बताता है कि वह कौन से टूल्स उपयोग 
कर सकता है और क्यों।\n\n### डिफ़ॉल्ट टूलबॉक्स\n\n`smolagents` एजेंट्स को सशक्त बनाने के लिए एक डिफ़ॉल्ट टूलबॉक्स के साथ आता है, जिसे आप आर्ग्यूमेंट `add_base_tools=True` के साथ अपने एजेंट में इनिशियलाइजेशन पर जोड़ सकते हैं:\n\n- **DuckDuckGo वेब सर्च**: DuckDuckGo ब्राउज़र का उपयोग करके वेब सर्च करता है।\n- **पायथन कोड इंटरप्रेटर**: आपका LLM जनरेटेड पायथन कोड एक सुरक्षित एनवायरनमेंट में चलाता है। यह टूल [`ToolCallingAgent`] में केवल तभी जोड़ा जाएगा जब आप इसे `add_base_tools=True` के साथ इनिशियलाइज़ करते हैं, क्योंकि कोड-बेस्ड एजेंट पहले से ही नेटिव रूप से पायथन कोड एक्जीक्यूट कर सकता है\n- **ट्रांसक्राइबर**: Whisper-Turbo पर बनाया गया एक स्पीच-टू-टेक्स्ट पाइपलाइन जो ऑडियो को टेक्स्ट में ट्रांसक्राइब करता है।\n\nआप मैन्युअल रूप से एक टूल का उपयोग उसके आर्ग्यूमेंट्स के साथ कॉल करके कर सकते हैं।\n\n```python\nfrom smolagents import WebSearchTool\n\nsearch_tool = WebSearchTool()\nprint(search_tool(\"Who's the current president of Russia?\"))\n```\n\n### अपने कस्टम टूल बनाएं  \n\nआप ऐसे उपयोग के मामलों के लिए अपने खुद के टूल बना सकते हैं जो Hugging Face के डिफ़ॉल्ट टूल्स द्वारा कवर नहीं किए गए हैं।  \nउदाहरण के लिए, चलिए एक टूल बनाते हैं जो दिए गए कार्य (task) के लिए हब से सबसे अधिक डाउनलोड किए गए मॉडल को रिटर्न करता है।  \n\nआप नीचे दिए गए कोड से शुरुआत करेंगे। \n\n```python\nfrom huggingface_hub import list_models\n\ntask = \"text-classification\"\n\nmost_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\nprint(most_downloaded_model.id)\n```\n\nयह कोड आसानी से टूल में बदला जा सकता है, बस इसे एक फ़ंक्शन में रैप करें और `tool` डेकोरेटर जोड़ें:  \nयह टूल बनाने का एकमात्र तरीका नहीं है: आप इसे सीधे [`Tool`] का सबक्लास बनाकर भी परिभाषित कर सकते हैं, जो आपको अधिक लचीलापन प्रदान करता है, जैसे भारी क्लास एट्रिब्यूट्स को इनिशियलाइज़ करने की संभावना।  \n\nचलो देखते हैं कि यह दोनों विकल्पों के लिए कैसे काम करता है:\n\n<hfoptions id=\"build-a-tool\">\n<hfoption id=\"@tool के साथ एक फ़ंक्शन को डेकोरेट करें\">\n\n```py\nfrom smolagents import 
tool\n\n@tool\ndef model_download_tool(task: str) -> str:\n    \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\n\n    Args:\n        task: The task for which to get the download count.\n    \"\"\"\n    most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n    return most_downloaded_model.id\n```\n\nफ़ंक्शन को चाहिए:  \n- एक स्पष्ट नाम: नाम टूल के कार्य को स्पष्ट रूप से बताने वाला होना चाहिए ताकि इसे चलाने वाले LLM को आसानी हो। चूंकि यह टूल कार्य के लिए सबसे अधिक डाउनलोड किए गए मॉडल को लौटाता है, इसका नाम `model_download_tool` रखा गया है।  \n- इनपुट और आउटपुट पर टाइप हिंट्स।\n- एक विवरण: इसमें 'Args:' भाग शामिल होना चाहिए, जिसमें प्रत्येक आर्ग्युमेंट का वर्णन (बिना टाइप संकेत के) किया गया हो। यह विवरण एक निर्देश मैनुअल की तरह होता है जो LLM को टूल चलाने में मदद करता है। इसे अनदेखा न करें।  \nइन सभी तत्वों को एजेंट की सिस्टम प्रॉम्प्ट में स्वचालित रूप से शामिल किया जाएगा: इसलिए इन्हें यथासंभव स्पष्ट बनाने का प्रयास करें!  \n\n> [!TIP]  \n> यह परिभाषा प्रारूप `apply_chat_template` में उपयोग की गई टूल स्कीमा जैसा ही है, केवल अतिरिक्त `tool` डेकोरेटर जोड़ा गया है: हमारे टूल उपयोग API के बारे में अधिक पढ़ें [यहाँ](https://huggingface.co/blog/unified-tool-use#passing-tools-to-a-chat-template)।  \n\n\nआप सीधे अपने एजेंट को इनिशियलाइज़ कर सकते हैं:  \n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[model_download_tool], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n<hfoption id=\"सबक्लास टूल\">\n\n```py\nfrom smolagents import Tool\n\nclass ModelDownloadTool(Tool):\n    name = \"model_download_tool\"\n    description = \"This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. 
It returns the name of the checkpoint.\"\n    inputs = {\"task\": {\"type\": \"string\", \"description\": \"The task for which to get the download count.\"}}\n    output_type = \"string\"\n\n    def forward(self, task: str) -> str:\n        most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return most_downloaded_model.id\n```\n\nसबक्लास को निम्नलिखित एट्रिब्यूट्स की आवश्यकता होती है:  \n- एक स्पष्ट `name`: नाम टूल के कार्य को स्पष्ट रूप से बताने वाला होना चाहिए।  \n- एक `description`: यह भी LLM के लिए निर्देश मैनुअल की तरह काम करता है।  \n- इनपुट प्रकार और उनके विवरण।  \n- आउटपुट प्रकार।  \nइन सभी एट्रिब्यूट्स को एजेंट की सिस्टम प्रॉम्प्ट में स्वचालित रूप से शामिल किया जाएगा, इन्हें स्पष्ट और विस्तृत बनाएं।  \n\n\nआप सीधे अपने एजेंट को इनिशियलाइज़ कर सकते हैं:  \n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[ModelDownloadTool()], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n</hfoptions>\n\nलॉग्स इस प्रकार होंगे:  \n```text\n╭──────────────────────────────────────── New run ─────────────────────────────────────────╮\n│                                                                                          │\n│ Can you give me the name of the model that has the most downloads in the 'text-to-video' │\n│ task on the Hugging Face Hub?                                                            
│\n│                                                                                          │\n╰─ InferenceClientModel - Qwen/Qwen2.5-Coder-32B-Instruct ───────────────────────────────────────────╯\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 model_name = model_download_tool(task=\"text-to-video\")                               │\n│   2 print(model_name)                                                                    │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nExecution logs:\nByteDance/AnimateDiff-Lightning\n\nOut: None\n[Step 0: Duration 0.27 seconds| Input tokens: 2,069 | Output tokens: 60]\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 final_answer(\"ByteDance/AnimateDiff-Lightning\")                                      │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nOut - Final answer: ByteDance/AnimateDiff-Lightning\n[Step 1: Duration 0.10 seconds| Input tokens: 4,288 | Output tokens: 148]\nOut[20]: 'ByteDance/AnimateDiff-Lightning'\n```\n\n [!TIP]  \n> टूल्स के बारे में अधिक पढ़ें [dedicated tutorial](./tutorials/tools#टूल-क्या-है-और-इसे-कैसे-बनाएं) में।  \n\n## मल्टी-एजेंट्स  \n\nMicrosoft के फ्रेमवर्क [Autogen](https://huggingface.co/papers/2308.08155) के साथ मल्टी-एजेंट सिस्टम्स की शुरुआत हुई।  \n\nइस प्रकार के फ्रेमवर्क में, आपके कार्य को हल करने के लिए कई एजेंट्स एक साथ काम करते हैं, न कि केवल एक।  \nयह अधिकांश बेंचमार्क्स पर बेहतर प्रदर्शन देता है। इसका कारण यह है कि कई कार्यों के लिए, एक सर्व-समावेशी प्रणाली के बजाय, आप उप-कार्यों पर विशेषज्ञता रखने वाली इकाइयों को पसंद करेंगे।  इस तरह, अलग-अलग टूल सेट्स और मेमोरी वाले एजेंट्स के पास विशेषकरण की अधिक 
कुशलता होती है। उदाहरण के लिए, कोड उत्पन्न करने वाले एजेंट की मेमोरी को वेब सर्च एजेंट द्वारा देखे गए वेबपेजों की सभी सामग्री से क्यों भरें? इन्हें अलग रखना बेहतर है।  \n\nआप `smolagents` का उपयोग करके आसानी से श्रेणीबद्ध मल्टी-एजेंट सिस्टम्स बना सकते हैं।  \n\nऐसा करने के लिए, एजेंट को [`ManagedAgent`] ऑब्जेक्ट में समाहित करें। यह ऑब्जेक्ट `agent`, `name`, और एक `description` जैसे तर्कों की आवश्यकता होती है, जो फिर मैनेजर एजेंट की सिस्टम प्रॉम्प्ट में एम्बेड किया जाता है  \n\nयहां एक एजेंट बनाने का उदाहरण दिया गया है जो हमारे [`WebSearchTool`] का उपयोग करके एक विशिष्ट वेब खोज एजेंट को प्रबंधित करता है।\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel, WebSearchTool, ManagedAgent\n\nmodel = InferenceClientModel()\n\nweb_agent = CodeAgent(tools=[WebSearchTool()], model=model)\n\nmanaged_web_agent = ManagedAgent(\n    agent=web_agent,\n    name=\"web_search\",\n    description=\"Runs web searches for you. Give it your query as an argument.\"\n)\n\nmanager_agent = CodeAgent(\n    tools=[], model=model, managed_agents=[managed_web_agent]\n)\n\nmanager_agent.run(\"Who is the CEO of Hugging Face?\")\n```\n\n> [!TIP]\n> कुशल मल्टी-एजेंट इंप्लीमेंटेशन का एक विस्तृत उदाहरण देखने के लिए, [कैसे हमने अपने मल्टी-एजेंट सिस्टम को GAIA लीडरबोर्ड के शीर्ष पर पहुंचाया](https://huggingface.co/blog/beating-gaia) पर जाएं।  \n\n\n## अपने एजेंट से बात करें और उसके विचारों को एक शानदार Gradio इंटरफेस में विज़ुअलाइज़ करें  \n\nआप `GradioUI` का उपयोग करके अपने एजेंट को इंटरैक्टिव तरीके से कार्य सौंप सकते हैं और उसके सोचने और निष्पादन की प्रक्रिया को देख सकते हैं। नीचे एक उदाहरण दिया गया है:\n\n```py\nfrom smolagents import (\n    load_tool,\n    CodeAgent,\n    InferenceClientModel,\n    GradioUI\n)\n\n# Import tool from Hub\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nmodel = InferenceClientModel(model_id=model_id)\n\n# Initialize the agent with the image generation tool\nagent = CodeAgent(tools=[image_generation_tool], 
model=model)\n\nGradioUI(agent).launch()\n```\n\nअंदरूनी तौर पर, जब यूजर एक नया उत्तर टाइप करता है, तो एजेंट को `agent.run(user_request, reset=False)` के साथ लॉन्च किया जाता है।  \nयहाँ `reset=False` फ्लैग का मतलब है कि एजेंट की मेमोरी इस नए कार्य को लॉन्च करने से पहले क्लियर नहीं होती, जिससे बातचीत जारी रहती है।  \n\nआप इस `reset=False` आर्ग्युमेंट का उपयोग किसी भी अन्य एजेंटिक एप्लिकेशन में बातचीत जारी रखने के लिए कर सकते हैं।  \n\n## अगले कदम  \n\nअधिक गहन उपयोग के लिए, आप हमारे ट्यूटोरियल्स देख सकते हैं:  \n- [हमारे कोड एजेंट्स कैसे काम करते हैं इसका विवरण](./tutorials/secure_code_execution)  \n- [अच्छे एजेंट्स बनाने के लिए यह गाइड](./tutorials/building_good_agents)  \n- [टूल उपयोग के लिए इन-डेप्थ गाइड](./tutorials/tools)  \n"
  },
  {
    "path": "docs/source/hi/index.md",
    "content": "# `smolagents`\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/license_to_call.png\" width=100%/>\n</div>\n\nयह लाइब्रेरी पावरफुल एजेंट्स बनाने के लिए सबसे सरल फ्रेमवर्क है! वैसे, \"एजेंट्स\" हैं क्या? हम अपनी परिभाषा [इस पेज पर](conceptual_guides/intro_agents) प्रदान करते हैं, जहाँ आपको यह भी पता चलेगा कि इन्हें कब उपयोग करें या न करें (स्पॉइलर: आप अक्सर एजेंट्स के बिना बेहतर काम कर सकते हैं)।\n\nयह लाइब्रेरी प्रदान करती है:\n\n✨ **सरलता**: Agents का लॉजिक लगभग एक हजार लाइन्स ऑफ़ कोड में समाहित है। हमने रॉ कोड के ऊपर एब्स्ट्रैक्शन को न्यूनतम आकार में रखा है!\n\n🌐 **सभी LLM के लिए सपोर्ट**: यह हब पर होस्ट किए गए मॉडल्स को उनके `transformers` वर्जन में या हमारे इन्फरेंस API के माध्यम से सपोर्ट करता है, साथ ही OpenAI, Anthropic से भी... किसी भी LLM से एजेंट को पावर करना वास्तव में आसान है।\n\n🧑‍💻 **कोड Agents के लिए फर्स्ट-क्लास सपोर्ट**, यानी ऐसे एजेंट्स जो अपनी एक्शन्स को कोड में लिखते हैं (कोड लिखने के लिए उपयोग किए जाने वाले एजेंट्स के विपरीत), [यहाँ और पढ़ें](tutorials/secure_code_execution)।\n\n🤗 **हब इंटीग्रेशन**: आप टूल्स को हब पर शेयर और लोड कर सकते हैं, और आगे और भी बहुत कुछ आने वाला है!\n\n<div class=\"mt-10\">\n  <div class=\"w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5\">\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./guided_tour\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">गाइडेड टूर</div>\n      <p class=\"text-gray-700\">बेसिक्स सीखें और एजेंट्स का उपयोग करने में परिचित हों। यदि आप पहली बार एजेंट्स का उपयोग कर रहे हैं तो यहाँ से शुरू करें!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./examples/text_to_sql\"\n      ><div class=\"w-full text-center 
bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">हाउ-टू गाइड्स</div>\n      <p class=\"text-gray-700\">एक विशिष्ट लक्ष्य प्राप्त करने में मदद के लिए गाइड: SQL क्वेरी जनरेट और टेस्ट करने के लिए एजेंट बनाएं!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./conceptual_guides/intro_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">कॉन्सेप्चुअल गाइड्स</div>\n      <p class=\"text-gray-700\">महत्वपूर्ण विषयों की बेहतर समझ बनाने के लिए उच्च-स्तरीय व्याख्याएं।</p>\n   </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./tutorials/building_good_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">ट्यूटोरियल्स</div>\n      <p class=\"text-gray-700\">एजेंट्स बनाने के महत्वपूर्ण पहलुओं को कवर करने वाले ट्यूटोरियल्स।</p>\n    </a>\n  </div>\n</div>"
  },
  {
    "path": "docs/source/hi/reference/agents.md",
    "content": "# Agents\n\n<Tip warning={true}>\n\nSmolagents एक experimental API है जो किसी भी समय बदल सकता है। एजेंट्स द्वारा लौटाए गए परिणाम भिन्न हो सकते हैं क्योंकि APIs या underlying मॉडल बदलने की संभावना रखते हैं।\n\n</Tip>\n\nAgents और tools के बारे में अधिक जानने के लिए [introductory guide](../index) पढ़ना सुनिश्चित करें। \nयह पेज underlying क्लासेज के लिए API docs को शामिल करता है।\n\n## Agents\n\nहमारे एजेंट्स [`MultiStepAgent`] से इनहेरिट करते हैं, जिसका अर्थ है कि वे कई चरणों में कार्य कर सकते हैं, प्रत्येक चरण में एक विचार, फिर एक टूल कॉल और एक्जीक्यूशन शामिल होता है। [इस कॉन्सेप्चुअल गाइड](../conceptual_guides/react) में अधिक पढ़ें।\n\nहम मुख्य [`Agent`] क्लास पर आधारित दो प्रकार के एजेंट्स प्रदान करते हैं।\n  - [`CodeAgent`] डिफ़ॉल्ट एजेंट है, यह अपने टूल कॉल्स को Python कोड में लिखता है।\n  - [`ToolCallingAgent`] अपने टूल कॉल्स को JSON में लिखता है।\n\nदोनों को इनिशियलाइजेशन पर `model` और टूल्स की सूची `tools` आर्गुमेंट्स की आवश्यकता होती है।\n\n### Agents की क्लासेज\n\n[[autodoc]] MultiStepAgent\n\n[[autodoc]] CodeAgent\n\n[[autodoc]] ToolCallingAgent\n\n### stream_to_gradio\n\n[[autodoc]] stream_to_gradio\n\n### GradioUI\n\n[[autodoc]] GradioUI\n\n## मॉडल्स\n\nआप स्वतंत्र रूप से अपने स्वयं के मॉडल बना सकते हैं और उनका उपयोग कर सकते हैं।\n\nआप अपने एजेंट के लिए कोई भी `model` कॉल करने योग्य उपयोग कर सकते हैं, जब तक कि:\n1. यह अपने इनपुट `messages` के लिए [messages format](./chat_templating) (`List[Dict[str, str]]`) का पालन करता है, और यह एक `str` लौटाता है।\n2. 
यह आर्गुमेंट `stop_sequences` में पास किए गए सीक्वेंस से *पहले* आउटपुट जनरेट करना बंद कर देता है।\n\nअपने LLM को परिभाषित करने के लिए, आप एक `custom_model` मेथड बना सकते हैं जो [messages](./chat_templating) की एक सूची स्वीकार करता है और टेक्स्ट युक्त .content विशेषता वाला एक ऑब्जेक्ट लौटाता है। इस कॉलेबल को एक `stop_sequences` आर्गुमेंट भी स्वीकार करने की आवश्यकता होती है जो बताता है कि कब जनरेट करना और बंद करना है।\n\n```python\nfrom huggingface_hub import login, InferenceClient\n\nlogin(\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nclient = InferenceClient(model=model_id)\n\ndef custom_model(messages, stop_sequences=[\"Task\"]):\n    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)\n    answer = response.choices[0].message\n    return answer\n```\n\nइसके अतिरिक्त, `custom_model` एक `grammar` आर्गुमेंट भी ले सकता है। जिस स्थिति में आप एजेंट इनिशियलाइजेशन पर एक `grammar` निर्दिष्ट करते हैं, यह आर्गुमेंट मॉडल के कॉल्स को आपके द्वारा इनिशियलाइजेशन पर परिभाषित `grammar` के साथ पास किया जाएगा, ताकि [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) की अनुमति मिल सके जिससे उचित-फॉर्मेटेड एजेंट आउटपुट को फोर्स किया जा सके।\n\n### TransformersModel\n\nसुविधा के लिए, हमने एक `TransformersModel` जोड़ा है जो इनिशियलाइजेशन पर दिए गए model_id के लिए एक लोकल `transformers` पाइपलाइन बनाकर ऊपर के बिंदुओं को लागू करता है।\n\n```python\nfrom smolagents import TransformersModel\n\nmodel = TransformersModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": \"Ok!\"}], stop_sequences=[\"great\"]))\n```\n```text\n>>> What a\n```\n\n[[autodoc]] TransformersModel\n\n### InferenceClientModel\n\n`InferenceClientModel` LLM के एक्जीक्यूशन के लिए [HF Inference API](https://huggingface.co/docs/api-inference/index) क्लाइंट को रैप करता है।\n\n```python\nfrom smolagents import InferenceClientModel\n\nmessages = [\n  
{\"role\": \"user\", \"content\": \"Hello, how are you?\"},\n  {\"role\": \"assistant\", \"content\": \"I'm doing great. How can I help you today?\"},\n  {\"role\": \"user\", \"content\": \"No need to help, take it easy.\"},\n]\n\nmodel = InferenceClientModel()\nprint(model(messages))\n```\n```text\n>>> Of course! If you change your mind, feel free to reach out. Take care!\n```\n[[autodoc]] InferenceClientModel\n\n### LiteLLMModel\n\n`LiteLLMModel` विभिन्न प्रदाताओं से 100+ LLMs को सपोर्ट करने के लिए [LiteLLM](https://www.litellm.ai/) का लाभ उठाता है।\nआप मॉडल इनिशियलाइजेशन पर kwargs पास कर सकते हैं जो तब मॉडल का उपयोग करते समय प्रयोग किए जाएंगे, उदाहरण के लिए नीचे हम `temperature` पास करते हैं।\n\n```python\nfrom smolagents import LiteLLMModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": \"Hello, how are you?\"},\n  {\"role\": \"assistant\", \"content\": \"I'm doing great. How can I help you today?\"},\n  {\"role\": \"user\", \"content\": \"No need to help, take it easy.\"},\n]\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", temperature=0.2, max_tokens=10)\nprint(model(messages))\n```\n\n[[autodoc]] LiteLLMModel\n\n### OpenAIModel\n\nयह क्लास आपको किसी भी OpenAI-कम्पैटिबल सर्वर पर उपलब्ध मॉडल को कॉल करने देती है।\nयहाँ बताया गया है कि आप इसे कैसे सेट कर सकते हैं (आप दूसरे सर्वर को पॉइंट करने के लिए `api_base` url को कस्टमाइज़ कर सकते हैं):\n```py\nimport os\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"gpt-4o\",\n    api_base=\"https://api.openai.com/v1\",\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n)\n```\n\n## Prompts\n\n[[autodoc]] smolagents.agents.PromptTemplates\n\n[[autodoc]] smolagents.agents.PlanningPromptTemplate\n\n[[autodoc]] smolagents.agents.ManagedAgentPromptTemplate\n\n[[autodoc]] smolagents.agents.FinalAnswerPromptTemplate\n"
  },
  {
    "path": "docs/source/hi/reference/tools.md",
    "content": "# Tools\n\n<Tip warning={true}>\n\nSmolagents एक experimental API है जो किसी भी समय बदल सकता है। एजेंट्स द्वारा लौटाए गए परिणाम भिन्न हो सकते हैं क्योंकि APIs या underlying मॉडल बदलने की संभावना रखते हैं।\n\n</Tip>\n\nएजेंट्स और टूल्स के बारे में अधिक जानने के लिए [introductory guide](../index) पढ़ना सुनिश्चित करें। \nयह पेज underlying क्लासेज के लिए API docs को शामिल करता है।\n\n## Tools\n\n### load_tool\n\n[[autodoc]] load_tool\n\n### tool\n\n[[autodoc]] tool\n\n### Tool\n\n[[autodoc]] Tool\n\n### launch_gradio_demo\n\n[[autodoc]] launch_gradio_demo\n\n## Default Tools\n\n### PythonInterpreterTool\n\n[[autodoc]] PythonInterpreterTool\n\n### DuckDuckGoSearchTool\n\n[[autodoc]] DuckDuckGoSearchTool\n\n### VisitWebpageTool\n\n[[autodoc]] VisitWebpageTool\n\n### UserInputTool\n\n[[autodoc]] UserInputTool\n\n## ToolCollection\n\n[[autodoc]] ToolCollection\n\n## Agent टाइप्स\n\nएजेंट्स टूल्स के बीच किसी भी प्रकार की ऑब्जेक्ट को संभाल सकते हैं; टूल्स, पूरी तरह से मल्टीमोडल होने के कारण, टेक्स्ट, इमेज, ऑडियो, वीडियो सहित अन्य प्रकारों को स्वीकार और रिटर्न कर सकते हैं। \nटूल्स के बीच अनुकूलता बढ़ाने के साथ-साथ इन रिटर्न्स को ipython (jupyter, colab, ipython notebooks, ...) 
में सही ढंग से रेंडर करने के लिए, हम इन टाइप्स के आसपास रैपर क्लासेज को लागू करते हैं।\n\nरैप किए गए ऑब्जेक्ट्स को वैसा ही व्यवहार करते रहना चाहिए जैसा वे मूल रूप से करते थे; एक टेक्स्ट ऑब्जेक्ट को अभी भी स्ट्रिंग की तरह व्यवहार करना चाहिए।\nएक इमेज ऑब्जेक्ट को अभी भी `PIL.Image` की तरह व्यवहार करना चाहिए।\n\nइन टाइप्स के तीन विशिष्ट उद्देश्य हैं:\n\n- टाइप पर `to_raw` को कॉल करने से अंतर्निहित ऑब्जेक्ट रिटर्न होना चाहिए\n- टाइप पर `to_string` को कॉल करने से ऑब्जेक्ट स्ट्रिंग के रूप में रिटर्न होना चाहिए: `AgentText` के मामले में यह स्वयं स्ट्रिंग हो सकती है, लेकिन अन्य मामलों में यह ऑब्जेक्ट के सीरियलाइज्ड वर्जन का पाथ होगा\n- इसे एक ipython kernel में प्रदर्शित करने पर ऑब्जेक्ट को सही ढंग से प्रदर्शित होना चाहिए\n\n### AgentText\n\n[[autodoc]] smolagents.agent_types.AgentText\n\n### AgentImage\n\n[[autodoc]] smolagents.agent_types.AgentImage\n\n### AgentAudio\n\n[[autodoc]] smolagents.agent_types.AgentAudio\n"
  },
  {
    "path": "docs/source/hi/tutorials/building_good_agents.md",
    "content": "# अच्छे Agents का निर्माण\n\n[[open-in-colab]]\n\nएक ऐसा एजेंट बनाने में जो काम करता है और ऐसा एजेंट बनाने में जो काम नहीं करता, ज़मीन-आसमान का अंतर है।\nहम ऐसे एजेंट्स कैसे बना सकते हैं जो पहली श्रेणी में आते हैं?\nइस गाइड में, हम एजेंट्स बनाने के लिए सर्वोत्तम प्रक्रियाओं के बारे में बात करेंगे।\n\n> [!TIP]\n> यदि आप एजेंट्स बनाने में नए हैं, तो पहले [एजेंट्स का परिचय](../conceptual_guides/intro_agents) और [smolagents की गाइडेड टूर](../guided_tour) पढ़ना सुनिश्चित करें।\n\n### सर्वश्रेष्ठ एजेंटिक सिस्टम सबसे सरल होते हैं: वर्कफ़्लो को जितना हो सके उतना सरल बनाएं\n\nअपने वर्कफ़्लो में एक LLM को कुछ एजेंसी देने से त्रुटियों का जोखिम होता है।\n\nअच्छी तरह से प्रोग्राम किए गए एजेंटिक सिस्टम में वैसे भी अच्छी एरर लॉगिंग और रीट्राई मैकेनिज्म होते हैं, जिससे LLM इंजन को अपनी गलतियों को सुधारने का मौका मिलता है। लेकिन LLM त्रुटि के जोखिम को यथासंभव कम करने के लिए, आपको अपना वर्कफ़्लो सरल बनाना चाहिए!\n\nआइए [एजेंट्स का परिचय](../conceptual_guides/intro_agents) से उदाहरण पर फिर से विचार करें: एक सर्फ ट्रिप कंपनी के लिए उपयोगकर्ता प्रश्नों का उत्तर देने वाला बॉट।\nएजेंट को हर बार जब एक नए सर्फ स्पॉट के बारे में पूछा जाता है तो \"travel distance API\" और \"weather API\" के लिए 2 अलग-अलग कॉल करने देने के बजाय, आप केवल एक एकीकृत टूल \"return_spot_information\" बना सकते हैं, एक फंक्शन जो दोनों APIs को एक साथ कॉल करता है और उनके संयोजित आउटपुट को उपयोगकर्ता को वापस करता है।\n\nयह लागत, देरी और त्रुटि जोखिम को कम करेगा!\n\nमुख्य दिशानिर्देश है: LLM कॉल्स की संख्या को जितना हो सके उतना कम करें।\n\nइससे कुछ निष्कर्ष निकलते हैं:\n- जब भी संभव हो, दो APIs के हमारे उदाहरण की तरह 2 टूल्स को एक में समूहित करें।\n- जब भी संभव हो, लॉजिक एजेंटिक निर्णयों के बजाय डिटरमिनिस्टिक फंक्शंस पर आधारित होना चाहिए।\n\n### LLM इंजन की ओर जानकारी के प्रवाह में सुधार करें\n\nयाद रखें कि आपका LLM इंजन एक *बुद्धिमान* रोबोट की तरह है, जो एक कमरे में बंद है, और बाहरी दुनिया के साथ इसका एकमात्र संचार दरवाजे के नीचे से नोट्स पास करना है।\n\nयह किसी भी ऐसी चीज के बारे में नहीं जानेगा जिसे आप स्पष्ट रूप से 
अपने प्रॉम्प्ट में नहीं डालते हैं।\n\nइसलिए पहले अपने कार्य को बहुत स्पष्ट बनाने से शुरू करें!\nचूंकि एक एजेंट LLM द्वारा संचालित होता है, आपके कार्य के निर्माण में छोटे बदलाव भी पूरी तरह से अलग परिणाम दे सकते हैं।\n\nफिर, टूल के उपयोग में अपने एजेंट की ओर जानकारी के प्रवाह में सुधार करें।\n\nपालन करने के लिए विशेष दिशानिर्देश:\n- प्रत्येक टूल को वह सब कुछ लॉग करना चाहिए (टूल की `forward` मेथड के अंदर केवल `print` स्टेटमेंट्स का उपयोग करके) जो LLM इंजन के लिए उपयोगी हो सकता है।\n  - विशेष रूप से, टूल एक्जीक्यूशन गलतियों पर विस्तृत लॉगिंग बहुत मदद करेगी!\n\nउदाहरण के लिए, यहाँ एक टूल है जो लोकेशन और डेट-टाइम के आधार पर मौसम डेटा प्राप्त करता है:\n\nपहले, यहाँ एक खराब रूप है:\n```python\nfrom datetime import datetime\nfrom smolagents import tool\n\ndef get_weather_report_at_coordinates(coordinates, date_time):\n    # Dummy function, returns a list of [temperature in °C, risk of rain on a scale 0-1, wave height in m]\n    return [28.0, 0.35, 0.85]\n\ndef convert_location_to_coordinates(location):\n    # Returns dummy coordinates\n    return [3.3, -42.0]\n\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for.\n        date_time: the date and time for which you want the report.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    # The expected format is never documented for the LLM: this is part of what makes the tool bad\n    date_time = datetime.strptime(date_time, \"%m/%d/%y %H:%M:%S\")\n    return str(get_weather_report_at_coordinates((lon, lat), date_time))\n```\n\nयह खराब क्यों है?\n- `date_time` के लिए उपयोग किए जाने वाले फॉर्मेट की सटीकता का कोई उल्लेख नहीं है।  \n- यह स्पष्ट नहीं है कि स्थान (location) को किस प्रकार निर्दिष्ट किया जाना चाहिए।  \n- त्रुटियों को स्पष्ट रूप से इंगित करने के लिए कोई लॉगिंग मेकैनिज्म मौजूद नहीं है, जैसे कि स्थान गलत फॉर्मेट में होना या `date_time` का सही ढंग से फॉर्मेट न होना।  \n- आउटपुट फॉर्मेट समझने में कठिन है।  \n\nयदि टूल कॉल विफल हो जाती है, तो मेमोरी में लॉग की गई एरर ट्रेस LLM को टूल की 
समस्याओं को ठीक करने के लिए रिवर्स इंजीनियरिंग में मदद कर सकती है। लेकिन इतना सारा काम LLM को ही क्यों करने देना?\n\nइस टूल को बेहतर तरीके से बनाने का एक उदाहरण इस प्रकार हो सकता है:\n\n```python\nfrom datetime import datetime\n\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for. Should be a place name, followed by possibly a city name, then a country, like \"Anchor Point, Taghazout, Morocco\".\n        date_time: the date and time for which you want the report, formatted as '%m/%d/%y %H:%M:%S'.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    try:\n        # Parse with the same format that the docstring announces to the LLM\n        date_time = datetime.strptime(date_time, \"%m/%d/%y %H:%M:%S\")\n    except Exception as e:\n        raise ValueError(\"Conversion of `date_time` to datetime format failed, make sure to provide a string in format '%m/%d/%y %H:%M:%S'. Full trace: \" + str(e))\n    temperature_celsius, risk_of_rain, wave_height = get_weather_report_at_coordinates((lon, lat), date_time)\n    return f\"Weather report for {location}, {date_time}: Temperature will be {temperature_celsius}°C, risk of rain is {risk_of_rain*100:.0f}%, wave height is {wave_height}m.\"\n```\n\nसामान्य तौर पर, अपने LLM का बोझ कम करने के लिए, खुद से यह अच्छा सवाल पूछें: \"यदि मैं नया और अनुभवहीन हूं और इस टूल का पहली बार उपयोग कर रहा हूं, तो इस टूल के साथ प्रोग्रामिंग करना और अपनी गलतियों को ठीक करना मेरे लिए कितना आसान होगा?\"\n\n### एजेंट को अधिक तर्क (arguments) दें\n\nअपने एजेंट को कार्य का वर्णन करने वाले साधारण स्ट्रिंग से आगे बढ़कर कुछ अतिरिक्त ऑब्जेक्ट्स देने के लिए, आप `additional_args` का उपयोग कर सकते हैं। यह आपको किसी भी प्रकार का ऑब्जेक्ट पास करने की सुविधा देता है:\n\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(model_id=model_id), add_base_tools=True)\n\nagent.run(\n    \"Why does Mike not know many 
people in New York?\",\n    additional_args={\"mp3_sound_file_url\":'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3'}\n)\n```\nउदाहरण के लिए, आप इस `additional_args` आर्ग्यूमेंट का उपयोग उन इमेजेज़ या स्ट्रिंग्स को पास करने के लिए कर सकते हैं जिन्हें आप चाहते हैं कि आपका एजेंट उपयोग करे।\n\n\n\n## अपने एजेंट को डिबग कैसे करें\n\n### 1. एक अधिक शक्तिशाली LLM का उपयोग करें\n\nएजेंटिक वर्कफ़्लो में, कुछ त्रुटियां वास्तविक होती हैं, जबकि कुछ अन्य त्रुटियां आपके LLM इंजन के सही तरीके से तर्क न कर पाने की वजह से होती हैं।  \nउदाहरण के लिए, इस ट्रेस को देखें, जहां मैंने एक `CodeAgent` से एक कार की तस्वीर बनाने के लिए कहा:\n```\n==================================================================================================== New task ====================================================================================================\nMake me a cool car picture\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nimage_generator(prompt=\"A cool, futuristic sports car with LED headlights, aerodynamic design, and vibrant color, high-res, photorealistic\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n\nLast output from code snippet: 
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nStep 1:\n\n- Time taken: 16.35 seconds\n- Input tokens: 1,383\n- Output tokens: 77\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nfinal_answer(\"/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nPrint outputs:\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nFinal answer:\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\n```\nउपयोगकर्ता को, एक इमेज लौटाए जाने के बजाय, उन्हें एक पाथ लौटाया जाता है।\nयह सिस्टम से एक बग की तरह दिख सकता है, लेकिन वास्तव में एजेंटिक सिस्टम ने त्रुटि नहीं की: यह केवल इसलिए है कि LLM ब्रेन ने इमेज आउटपुट को एक वेरिएबल में सेव करने की गलती की।\nइस प्रकार यह इमेज को फिर से एक्सेस नहीं कर सकता है सिवाय इमेज को सेव करते समय लॉग किए गए पाथ का उपयोग करके, इसलिए यह इमेज के बजाय पाथ लौटाता है।\n\nअपने एजेंट को डीबग 
करने का पहला कदम इस प्रकार है \"एक अधिक शक्तिशाली LLM का उपयोग करें\"। `Qwen/Qwen2.5-72B-Instruct` जैसे विकल्प वह गलती नहीं करते।\n\n### 2. अधिक मार्गदर्शन / अधिक जानकारी प्रदान करें\n\nआप कम शक्तिशाली मॉडल्स का भी उपयोग कर सकते हैं, बशर्ते आप उन्हें अधिक प्रभावी ढंग से मार्गदर्शन करें।\n\nअपने आप को अपने मॉडल की जगह रखें: यदि आप कार्य को हल करने वाला मॉडल होते, तो क्या आप उपलब्ध जानकारी (सिस्टम प्रॉम्प्ट + कार्य निर्माण + टूल विवरण से) के साथ संघर्ष करते?\n\nक्या आपको कुछ अतिरिक्त स्पष्टीकरण की आवश्यकता होती?\n\nअतिरिक्त जानकारी प्रदान करने के लिए, हम तुरंत सिस्टम प्रॉम्प्ट को बदलने की सलाह नहीं देते हैं: डिफ़ॉल्ट सिस्टम प्रॉम्प्ट में कई समायोजन हैं जिन्हें आप तब तक नहीं बिगाड़ना चाहते जब तक आप प्रॉम्प्ट को बहुत अच्छी तरह से नहीं समझते।\nअपने LLM इंजन को मार्गदर्शन करने के बेहतर तरीके हैं:\n- यदि यह कार्य को हल करने के बारे में है: इन सभी विवरणों को कार्य में जोड़ें। यह कार्य 100 पेज लंबा हो सकता है।\n- यदि यह टूल्स के उपयोग के बारे में है: इन विवरणों को अपने टूल्स की `description` विशेषता में जोड़ें।\n\n### 3. सिस्टम प्रॉम्प्ट बदलें (आमतौर पर यह सलाह नहीं दी जाती)\n\nयदि उपरोक्त स्पष्टीकरण पर्याप्त नहीं हैं, तो आप सिस्टम प्रॉम्प्ट बदल सकते हैं।\n\nआइए देखें कि यह कैसे काम करता है। उदाहरण के लिए, आइए [`CodeAgent`] के लिए डिफ़ॉल्ट सिस्टम प्रॉम्प्ट की जाँच करें (नीचे दिया गया वर्जन जीरो-शॉट उदाहरणों को छोड़कर छोटा किया गया है)।\n\n```python\nprint(agent.prompt_templates[\"system_prompt\"])\n```\nHere is what you get:\n```text\nYou are an expert assistant who can solve any task using code blobs. 
You will be given a task to solve as best you can.\nTo do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\nTo solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.\n\nAt each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.\nThen in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '<end_code>' sequence.\nDuring each intermediate step, you can use 'print()' to save whatever important information you will then need.\nThese print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\nIn the end you have to return a final answer using the `final_answer` tool.\n\nHere are a few examples using notional tools:\n---\nTask: \"Generate an image of the oldest person in this document.\"\n\nThought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\nCode:\n```py\nanswer = document_qa(document=document, question=\"Who is the oldest person mentioned?\")\nprint(answer)\n```<end_code>\nObservation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\nThought: I will now generate an image showcasing the oldest person.\nCode:\n```py\nimage = image_generator(\"A portrait of John Doe, a 55-year-old man living in Canada.\")\nfinal_answer(image)\n```<end_code>\n\n---\nTask: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\nThought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool\nCode:\n```py\nresult = 5 + 3 + 
1294.678\nfinal_answer(result)\n```<end_code>\n\n---\nTask:\n\"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.\nYou have been provided with these additional arguments, that you can access using the keys as variables in your python code:\n{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}\"\n\nThought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.\nCode:\n```py\ntranslated_question = translator(question=question, src_lang=\"French\", tgt_lang=\"English\")\nprint(f\"The translated question is {translated_question}.\")\nanswer = image_qa(image=image, question=translated_question)\nfinal_answer(f\"The answer is {answer}\")\n```<end_code>\n\n---\nTask:\nIn a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\nWhat does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\nThought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\nCode:\n```py\npages = search(query=\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\")\nprint(pages)\n```<end_code>\nObservation:\nNo result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\nThought: The query was maybe too restrictive and did not find any results. 
Let's try again with a broader query.\nCode:\n```py\npages = search(query=\"1979 interview Stanislaus Ulam\")\nprint(pages)\n```<end_code>\nObservation:\nFound 6 pages:\n[Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n[Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n(truncated)\n\nThought: I will read the first 2 pages to know more.\nCode:\n```py\nfor url in [\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\", \"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\"]:\n    whole_page = visit_webpage(url)\n    print(whole_page)\n    print(\"\\n\" + \"=\"*80 + \"\\n\")  # Print separator between pages\n```<end_code>\nObservation:\nManhattan Project Locations:\nLos Alamos, NM\nStanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. 
In this interview, he discusses his work at\n(truncated)\n\nThought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\" Let's answer in one word.\nCode:\n```py\nfinal_answer(\"diminished\")\n```<end_code>\n\n---\nTask: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\nThought: I need to get the populations for both cities and compare them: I will use the tool `search` to get the population of both cities.\nCode:\n```py\nfor city in [\"Guangzhou\", \"Shanghai\"]:\n    print(f\"Population {city}:\", search(f\"{city} population\")\n```<end_code>\nObservation:\nPopulation Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\nPopulation Shanghai: '26 million (2019)'\n\nThought: Now I know that Shanghai has the highest population.\nCode:\n```py\nfinal_answer(\"Shanghai\")\n```<end_code>\n\n---\nTask: \"What is the current age of the pope, raised to the power 0.36?\"\n\nThought: I will use the tool `wiki` to get the age of the pope, and confirm that with a web search.\nCode:\n```py\npope_age_wiki = wiki(query=\"current pope age\")\nprint(\"Pope age as per wikipedia:\", pope_age_wiki)\npope_age_search = web_search(query=\"current pope age\")\nprint(\"Pope age as per google search:\", pope_age_search)\n```<end_code>\nObservation:\nPope age: \"The pope Francis is currently 88 years old.\"\n\nThought: I know that the pope is 88 years old. Let's compute the result using python code.\nCode:\n```py\npope_current_age = 88 ** 0.36\nfinal_answer(pope_current_age)\n```<end_code>\n\nAbove example were using notional tools that might not exist for you. 
On top of performing computations in the Python code snippets that you create, you only have access to these tools:\n{%- for tool in tools.values() %}\n- {{ tool.to_tool_calling_prompt() }}\n{%- endfor %}\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n{%- for agent in managed_agents.values() %}\n- {{ agent.name }}: {{ agent.description }}\n{%- endfor %}\n{%- endif %}\n\nHere are the rules you should always follow to solve your task:\n1. Always provide a 'Thought:' sequence, and a 'Code:\\n```py' sequence ending with '```<end_code>' sequence, else you will fail.\n2. Use only variables that you have defined!\n3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wiki({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wiki(query=\"What is the place where James Bond lives?\")'.\n4. Take care to not chain too many sequential tool calls in the same code block, especially when the output format is unpredictable. For instance, a call to search has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n5. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n6. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n7. 
Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n8. You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n9. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n10. Don't give up! You're in charge of solving the task, not providing directions to solve it.\n\nNow Begin! If you solve the task correctly, you will receive a reward of $1,000,000.\n```\n\nजैसा कि आप देख सकते हैं, `\"{{ tool.description }}\"` जैसे प्लेसहोल्डर्स हैं: इनका उपयोग एजेंट इनिशियलाइजेशन के समय टूल्स या मैनेज्ड एजेंट्स के कुछ स्वचालित रूप से जनरेट किए गए विवरणों को डालने के लिए किया जाएगा।\n\nइसलिए जबकि आप `system_prompt` पैरामीटर में अपने कस्टम प्रॉम्प्ट को आर्गुमेंट के रूप में पास करके इस सिस्टम प्रॉम्प्ट टेम्पलेट को ओवरराइट कर सकते हैं, आपके नए सिस्टम प्रॉम्प्ट में निम्नलिखित प्लेसहोल्डर्स होने चाहिए:\n- टूल विवरण डालने के लिए।\n  ```\n  {%- for tool in tools.values() %}\n  - {{ tool.to_tool_calling_prompt() }}\n  {%- endfor %}\n  ```\n- यदि कोई मैनेज्ड एजेंट्स हैं तो उनके लिए विवरण डालने के लिए।\n  ```\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. 
Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  {%- for agent in managed_agents.values() %}\n  - {{ agent.name }}: {{ agent.description }}\n  {%- endfor %}\n  {%- endif %}\n  ```\n- केवल `CodeAgent` के लिए: अधिकृत इम्पोर्ट्स की सूची डालने के लिए `\"{{authorized_imports}}\"`।\n\nफिर आप सिस्टम प्रॉम्प्ट को निम्नानुसार बदल सकते हैं:\n\n```py\nagent.prompt_templates[\"system_prompt\"] = agent.prompt_templates[\"system_prompt\"] + \"\\nHere you go!\"\n```\n\nThis also works with the [`ToolCallingAgent`].\n\n\n### 4. अतिरिक्त योजना\n\nहम पूरक योजना चरण के लिए एक मॉडल प्रदान करते हैं, जिसे एजेंट सामान्य क्रियाओं के चरणों के बीच नियमित रूप से चला सकता है। इस चरण में कोई टूल कॉल नहीं होती है, LLM से केवल उन तथ्यों की सूची को अपडेट करने के लिए कहा जाता है जो उसे ज्ञात हैं और इन तथ्यों के आधार पर उसे अगले कदमों के बारे में विचार करना होता है।\n\n```py\nfrom smolagents import load_tool, CodeAgent, InferenceClientModel, WebSearchTool\nfrom dotenv import load_dotenv\n\nload_dotenv()\n\n# Import tool from Hub\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nsearch_tool = WebSearchTool()\n\nagent = CodeAgent(\n    tools=[search_tool],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen2.5-72B-Instruct\"),\n    planning_interval=3 # This is where you activate planning!\n)\n\n# Run it!\nresult = agent.run(\n    \"How long would a cheetah at full speed take to run the length of Pont Alexandre III?\",\n)\n```\n"
  },
  {
    "path": "docs/source/hi/tutorials/inspect_runs.md",
    "content": "# OpenTelemetry के साथ runs का निरीक्षण\n\n[[open-in-colab]]\n\n> [!TIP]\n> यदि आप एजेंट्स बनाने में नए हैं, तो पहले [एजेंट्स का परिचय](../conceptual_guides/intro_agents) और [smolagents की गाइडेड टूर](../guided_tour) पढ़ना सुनिश्चित करें।\n\n### Agents runs को लॉग क्यों करें?\n\nAgent runs को डीबग करना जटिल होता है।\n\nयह सत्यापित करना कठिन है कि एक रन ठीक से चला या नहीं, क्योंकि एजेंट वर्कफ़्लो [डिज़ाइन के अनुसार अप्रत्याशित](../conceptual_guides/intro_agents) होते हैं (यदि वे प्रत्याशित होते, तो आप पुराने अच्छे कोड का ही उपयोग कर रहे होते)।\n\nऔर रन का निरीक्षण करना भी कठिन है: मल्टी-स्टेप एजेंट्स जल्दी ही कंसोल को लॉग से भर देते हैं, और अधिकांश त्रुटियां केवल \"LLM dumb\" प्रकार की त्रुटियां होती हैं, जिनसे LLM अगले चरण में बेहतर कोड या टूल कॉल लिखकर स्वयं को सुधार लेता है।\n\nइसलिए बाद के निरीक्षण और मॉनिटरिंग के लिए प्रोडक्शन में agent runs को रिकॉर्ड करने के लिए इंस्ट्रुमेंटेशन का उपयोग करना आवश्यक है!\n\nहमने agent runs को इंस्ट्रुमेंट करने के लिए [OpenTelemetry](https://opentelemetry.io/) मानक को अपनाया है।\n\nइसका मतलब है कि आप बस कुछ इंस्ट्रुमेंटेशन कोड चला सकते हैं, फिर अपने एजेंट्स को सामान्य रूप से चला सकते हैं, और सब कुछ आपके प्लेटफॉर्म में लॉग हो जाता है।\n\nयह इस प्रकार होता है:\nपहले आवश्यक पैकेज इंस्टॉल करें। यहां हम [Phoenix by Arize AI](https://github.com/Arize-ai/phoenix) इंस्टॉल करते हैं क्योंकि यह लॉग्स को एकत्र और निरीक्षण करने का एक अच्छा समाधान है, लेकिन इस संग्रह और निरीक्षण भाग के लिए आप अन्य OpenTelemetry-कम्पैटिबल प्लेटफॉर्म्स का उपयोग कर सकते हैं।\n\n```shell\npip install smolagents\npip install arize-phoenix opentelemetry-sdk opentelemetry-exporter-otlp openinference-instrumentation-smolagents\n```\n\nफिर कलेक्टर को बैकग्राउंड में चलाएं।\n\n```shell\npython -m phoenix.server.main serve\n```\n\nअंत में, अपने एजेंट्स को ट्रेस करने और ट्रेस को नीचे परिभाषित एंडपॉइंट पर Phoenix को भेजने के लिए `SmolagentsInstrumentor` को सेट करें।\n\n```python\nfrom opentelemetry import trace\nfrom opentelemetry.sdk.trace import 
TracerProvider\nfrom opentelemetry.sdk.trace.export import BatchSpanProcessor\n\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\nfrom opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter\nfrom opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor\n\nendpoint = \"http://0.0.0.0:6006/v1/traces\"\ntrace_provider = TracerProvider()\ntrace_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))\n\nSmolagentsInstrumentor().instrument(tracer_provider=trace_provider)\n```\nतब आप अपने एजेंट चला सकते हैं!\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel()\n\nmanaged_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"managed_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[managed_agent],\n)\nmanager_agent.run(\n    \"If the US keeps its 2024 growth rate, how many years will it take for the GDP to double?\"\n)\n```\nऔर फिर आप अपने रन का निरीक्षण करने के लिए `http://0.0.0.0:6006/projects/` पर जा सकते हैं!\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.png\">\n\nआप देख सकते हैं कि CodeAgent ने अपने मैनेज्ड ToolCallingAgent को (वैसे, मैनेज्ड एजेंट एक CodeAgent भी हो सकता था) U.S. 2024 ग्रोथ रेट के लिए वेब सर्च चलाने के लिए कॉल किया। फिर मैनेज्ड एजेंट ने अपनी रिपोर्ट लौटाई और मैनेजर एजेंट ने अर्थव्यवस्था के दोगुना होने का समय गणना करने के लिए उस पर कार्य किया! अच्छा है, है ना?"
  },
  {
    "path": "docs/source/hi/tutorials/secure_code_execution.md",
    "content": "# सुरक्षित कोड एक्जीक्यूशन\n\n[[open-in-colab]]\n\n> [!TIP]\n> यदि आप एजेंट्स बनाने में नए हैं, तो सबसे पहले [एजेंट्स का परिचय](../conceptual_guides/intro_agents) और [smolagents की गाइडेड टूर](../guided_tour) पढ़ना सुनिश्चित करें।\n\n### कोड Agents\n\n[कई](https://huggingface.co/papers/2402.01030) [शोध](https://huggingface.co/papers/2411.01747) [पत्रों](https://huggingface.co/papers/2401.00812) ने दिखाया है कि LLM द्वारा अपनी क्रियाओं (टूल कॉल्स) को कोड में लिखना, टूल कॉलिंग के वर्तमान मानक प्रारूप से बहुत बेहतर है, जो industry में \"टूल्स नेम्स और आर्ग्यूमेंट्स को JSON के रूप में लिखने\" के विभिन्न रूप हैं।\n\nकोड बेहतर क्यों है? क्योंकि हमने अपनी कोड भाषाओं को विशेष रूप से कंप्यूटर द्वारा की जाने वाली क्रियाओं को व्यक्त करने के लिए तैयार किया है। यदि JSON स्निपेट्स एक बेहतर तरीका होता, तो यह पैकेज JSON स्निपेट्स में लिखा गया होता और शैतान हम पर हंस रहा होता।\n\nकोड कंप्यूटर पर क्रियाएँ व्यक्त करने का बेहतर तरीका है। इसमें बेहतर है:\n- **कंपोज़ेबिलिटी:** क्या आप JSON क्रियाओं को एक-दूसरे के भीतर नेस्ट कर सकते हैं, या बाद में पुन: उपयोग करने के लिए JSON क्रियाओं का एक सेट परिभाषित कर सकते हैं, जैसे आप बस एक पायथन फ़ंक्शन परिभाषित कर सकते हैं?\n- **ऑब्जेक्ट प्रबंधन:** JSON में `generate_image` जैसी क्रिया का आउटपुट कैसे स्टोर करें?\n- **सामान्यता:** कोड किसी भी कंप्यूटर कार्य को व्यक्त करने के लिए बनाया गया है।\n- **LLM प्रशिक्षण कॉर्पस में प्रतिनिधित्व:** क्यों न इस आशीर्वाद का लाभ उठाएं कि उच्च गुणवत्ता वाले कोड उदाहरण पहले से ही LLM प्रशिक्षण डेटा में शामिल हैं?\n\nयह नीचे दी गई छवि में दर्शाया गया है, जो [Executable Code Actions Elicit Better LLM Agents](https://huggingface.co/papers/2402.01030) से ली गई है।\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png\">\n\nयही कारण है कि हमने कोड एजेंट्स, इस मामले में पायथन एजेंट्स पर जोर दिया, जिसका मतलब सुरक्षित पायथन इंटरप्रेटर बनाने पर अधिक प्रयास करना था।\n\n### लोकल पायथन इंटरप्रेटर\n\nडिफ़ॉल्ट रूप से, `CodeAgent` 
LLM-जनरेटेड कोड को आपके एनवायरनमेंट में चलाता है।\nयह एक्जीक्यूशन वैनिला पायथन इंटरप्रेटर द्वारा नहीं किया जाता: हमने एक अधिक सुरक्षित `LocalPythonExecutor` को शुरू से फिर से बनाया है।\nयह इंटरप्रेटर सुरक्षा के लिए डिज़ाइन किया गया है:\n - इम्पोर्ट्स को उपयोगकर्ता द्वारा स्पष्ट रूप से पास की गई सूची तक सीमित करना\n - इनफिनिट लूप्स और रिसोर्स ब्लोटिंग को रोकने के लिए ऑपरेशंस की संख्या को कैप करना\n - कोई भी ऐसा ऑपरेशन नहीं करेगा जो पूर्व-परिभाषित नहीं है\n\nहमने इसे कई उपयोग मामलों में इस्तेमाल किया है, और कभी भी एनवायरनमेंट को कोई नुकसान नहीं देखा। \n\nहालांकि यह समाधान पूरी तरह से सुरक्षित नहीं है: कोई ऐसे अवसरों की कल्पना कर सकता है जहां दुर्भावनापूर्ण कार्यों के लिए फाइन-ट्यून किए गए LLM अभी भी आपके एनवायरनमेंट को नुकसान पहुंचा सकते हैं। उदाहरण के लिए यदि आपने छवियों को प्रोसेस करने के लिए `Pillow` जैसे मासूम पैकेज की अनुमति दी है, तो LLM आपकी हार्ड ड्राइव को ब्लोट करने के लिए हजारों छवियों को सेव कर सकता है।\nयदि आपने खुद LLM इंजन चुना है तो यह निश्चित रूप से संभावित नहीं है, लेकिन यह हो सकता है।\n\nतो यदि आप अतिरिक्त सावधानी बरतना चाहते हैं, तो आप नीचे वर्णित रिमोट कोड एक्जीक्यूशन विकल्प का उपयोग कर सकते हैं।\n\n### E2B कोड एक्जीक्यूटर\n\nअधिकतम सुरक्षा के लिए, आप कोड को सैंडबॉक्स्ड एनवायरनमेंट में चलाने के लिए E2B के साथ हमारे एकीकरण का उपयोग कर सकते हैं। यह एक रिमोट एक्जीक्यूशन सेवा है जो आपके कोड को एक आइसोलेटेड कंटेनर में चलाती है, जिससे कोड का आपके स्थानीय एनवायरनमेंट को प्रभावित करना असंभव हो जाता है।\n\nइसके लिए, आपको अपना E2B अकाउंट सेटअप करने और अपने एनवायरनमेंट वेरिएबल्स में अपना `E2B_API_KEY` सेट करने की आवश्यकता होगी। अधिक जानकारी के लिए [E2B की क्विकस्टार्ट डॉक्यूमेंटेशन](https://e2b.dev/docs/quickstart) पर जाएं।\n\nफिर आप इसे `pip install e2b-code-interpreter python-dotenv` के साथ इंस्टॉल कर सकते हैं।\n\nअब आप तैयार हैं!\n\nकोड एक्जीक्यूटर को E2B पर सेट करने के लिए, बस अपने `CodeAgent` को इनिशियलाइज़ करते समय `executor_type=\"e2b\"` फ्लैग पास करें।\nध्यान दें कि आपको `additional_authorized_imports` में सभी टूल की डिपेंडेंसीज़ जोड़नी चाहिए, ताकि 
एक्जीक्यूटर उन्हें इंस्टॉल करे।\n\n```py\nfrom smolagents import CodeAgent, VisitWebpageTool, InferenceClientModel\nagent = CodeAgent(\n    tools = [VisitWebpageTool()],\n    model=InferenceClientModel(),\n    additional_authorized_imports=[\"requests\", \"markdownify\"],\n    executor_type=\"e2b\"\n)\n\nagent.run(\"What was Abraham Lincoln's preferred pet?\")\n```\n\nE2B कोड एक्जीक्यूशन वर्तमान में मल्टी-एजेंट्स के साथ काम नहीं करता है - क्योंकि कोड ब्लॉब में एक एजेंट कॉल करना जो रिमोटली एक्जीक्यूट किया जाना चाहिए, यह एक गड़बड़ है। लेकिन हम इसे जोड़ने पर काम कर रहे हैं!\n"
  },
  {
    "path": "docs/source/hi/tutorials/tools.md",
    "content": "# Tools\n\n[[open-in-colab]]\n\nयहाँ, हम एडवांस्ड tools उपयोग देखेंगे।\n\n> [!TIP]\n> यदि आप एजेंट्स बनाने में नए हैं, तो सबसे पहले [एजेंट्स का परिचय](../conceptual_guides/intro_agents) और [smolagents की गाइडेड टूर](../guided_tour) पढ़ना सुनिश्चित करें।\n\n- [Tools](#tools)\n    - [टूल क्या है, और इसे कैसे बनाएं?](#टूल-क्या-है-और-इसे-कैसे-बनाएं)\n    - [अपना टूल हब पर शेयर करें](#अपना-टूल-हब-पर-शेयर-करें)\n    - [स्पेस को टूल के रूप में इम्पोर्ट करें](#स्पेस-को-टूल-के-रूप-में-इम्पोर्ट-करें)\n    - [LangChain टूल्स का उपयोग करें](#LangChain-टूल्स-का-उपयोग-करें)\n    - [अपने एजेंट के टूलबॉक्स को मैनेज करें](#अपने-एजेंट-के-टूलबॉक्स-को-मैनेज-करें)\n    - [टूल्स का कलेक्शन उपयोग करें](#टूल्स-का-कलेक्शन-उपयोग-करें)\n\n### टूल क्या है और इसे कैसे बनाएं\n\nटूल मुख्य रूप से एक फ़ंक्शन है जिसे एक LLM एजेंटिक सिस्टम में उपयोग कर सकता है।\n\nलेकिन इसका उपयोग करने के लिए, LLM को एक API दी जाएगी: नाम, टूल विवरण, इनपुट प्रकार और विवरण, आउटपुट प्रकार।\n\nइसलिए यह केवल एक फ़ंक्शन नहीं हो सकता। यह एक क्लास होनी चाहिए।\n\nतो मूल रूप से, टूल एक क्लास है जो एक फ़ंक्शन को मेटाडेटा के साथ रैप करती है जो LLM को समझने में मदद करती है कि इसका उपयोग कैसे करें।\n\nयह कैसा दिखता है:\n\n```python\nfrom smolagents import Tool\n\nclass HFModelDownloadsTool(Tool):\n    name = \"model_download_counter\"\n    description = \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\"\"\"\n    inputs = {\n        \"task\": {\n            \"type\": \"string\",\n            \"description\": \"the task category (such as text-classification, depth-estimation, etc)\",\n        }\n    }\n    output_type = \"string\"\n\n    def forward(self, task: str):\n        from huggingface_hub import list_models\n\n        model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return model.id\n\nmodel_downloads_tool = HFModelDownloadsTool()\n```\n\nकस्टम टूल `Tool` को सबक्लास करता 
है उपयोगी मेथड्स को इनहेरिट करने के लिए। चाइल्ड क्लास भी परिभाषित करती है:\n- एक `name` एट्रिब्यूट, जो टूल के नाम से संबंधित है। नाम आमतौर पर बताता है कि टूल क्या करता है। चूंकि कोड एक टास्क के लिए सबसे अधिक डाउनलोड वाले मॉडल को रिटर्न करता है, इसलिए इसे `model_download_counter` नाम दें।\n- एक `description` एट्रिब्यूट एजेंट के सिस्टम प्रॉम्प्ट को पॉपुलेट करने के लिए उपयोग किया जाता है।\n- एक `inputs` एट्रिब्यूट, जो `\"type\"` और `\"description\"` keys वाला डिक्शनरी है। इसमें जानकारी होती है जो पायथन इंटरप्रेटर को इनपुट के बारे में शिक्षित विकल्प चुनने में मदद करती है।\n- एक `output_type` एट्रिब्यूट, जो आउटपुट टाइप को निर्दिष्ट करता है। `inputs` और `output_type` दोनों के लिए टाइप [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema) होने चाहिए, वे इनमें से कोई भी हो सकते हैं: `[\"string\", \"boolean\", \"integer\", \"number\", \"image\", \"audio\", \"array\", \"object\", \"any\", \"null\"]`।\n- एक `forward` मेथड जिसमें एक्जीक्यूट किया जाने वाला इन्फरेंस कोड होता है।\n\nएजेंट में उपयोग किए जाने के लिए इतना ही चाहिए!\n\nटूल बनाने का एक और तरीका है। [guided_tour](../guided_tour) में, हमने `@tool` डेकोरेटर का उपयोग करके एक टूल को लागू किया। [`tool`] डेकोरेटर सरल टूल्स को परिभाषित करने का अनुशंसित तरीका है, लेकिन कभी-कभी आपको इससे अधिक की आवश्यकता होती है: अधिक स्पष्टता के लिए एक क्लास में कई मेथड्स का उपयोग करना, या अतिरिक्त क्लास एट्रिब्यूट्स का उपयोग करना।\n\nइस स्थिति में, आप ऊपर बताए अनुसार [`Tool`] को सबक्लास करके अपना टूल बना सकते हैं।\n\n### अपना टूल हब पर शेयर करें\n\nआप टूल पर [`~Tool.push_to_hub`] को कॉल करके अपना कस्टम टूल हब पर शेयर कर सकते हैं। सुनिश्चित करें कि आपने हब पर इसके लिए एक रिपॉजिटरी बनाई है और आप राइट एक्सेस वाला टोकन उपयोग कर रहे हैं, क्योंकि हब पर पुश करने के लिए लिखने की अनुमति चाहिए।\n\n```python\nmodel_downloads_tool.push_to_hub(\"{your_username}/hf-model-downloads\", token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n```\n\nहब पर पुश सफल होने के लिए, आपके टूल को कुछ नियमों का पालन करना होगा:\n- सभी मेथड्स सेल्फ-कंटेन्ड हैं, यानी उनके आर्ग्स से 
आने वाले वेरिएबल्स का उपयोग करें।\n- उपरोक्त बिंदु के अनुसार, **सभी इम्पोर्ट्स को सीधे टूल के फ़ंक्शंस के भीतर परिभाषित किया जाना चाहिए**, अन्यथा आपको अपने कस्टम टूल के साथ [`~Tool.save`] या [`~Tool.push_to_hub`] को कॉल करने का प्रयास करते समय एरर मिलेगा।\n- यदि आप `__init__` विधि को सबक्लास करते हैं, तो आप इसे `self` के अलावा कोई अन्य आर्ग्यूमेंट नहीं दे सकते। ऐसा इसलिए है क्योंकि किसी विशिष्ट टूल इंस्टेंस के इनिशियलाइजेशन के दौरान सेट किए गए तर्कों को आर्ग्यूमेंट्स करना कठिन होता है, जो उन्हें हब पर ठीक से साझा करने से रोकता है। और वैसे भी, एक विशिष्ट क्लास बनाने का विचार यह है कि आप हार्ड-कोड के लिए आवश्यक किसी भी चीज़ के लिए क्लास विशेषताएँ पहले से ही सेट कर सकते हैं (बस `your_variable=(...)` को सीधे `class YourTool(Tool):` पंक्ति के अंतर्गत सेट करें ). और निश्चित रूप से आप अभी भी `self.your_variable` को असाइन करके अपने कोड में कहीं भी एक क्लास विशेषता बना सकते हैं।\n\n\nएक बार जब आपका टूल हब पर पुश हो जाता है, तो आप इसे विज़ुअलाइज़ कर सकते हैं। [यहाँ](https://huggingface.co/spaces/m-ric/hf-model-downloads) `model_downloads_tool` है जिसे मैंने पुश किया है। इसमें एक अच्छा ग्रेडियो इंटरफ़ेस है।\n\nटूल फ़ाइलों में गहराई से जाने पर, आप पा सकते हैं कि सारी टूल लॉजिक [tool.py](https://huggingface.co/spaces/m-ric/hf-model-downloads/blob/main/tool.py) के अंतर्गत है। यहीं आप किसी और द्वारा शेयर किए गए टूल का निरीक्षण कर सकते हैं।\n\nफिर आप टूल को [`load_tool`] के साथ लोड कर सकते हैं या [`~Tool.from_hub`] के साथ बना सकते हैं और इसे अपने एजेंट में `tools` पैरामीटर में पास कर सकते हैं।\nचूंकि टूल्स को चलाने का मतलब कस्टम कोड चलाना है, आपको यह सुनिश्चित करना होगा कि आप रिपॉजिटरी पर भरोसा करते हैं, इसलिए हम हब से टूल लोड करने के लिए `trust_remote_code=True` पास करने की आवश्यकता रखते हैं।\n\n```python\nfrom smolagents import load_tool, CodeAgent\n\nmodel_download_tool = load_tool(\n    \"{your_username}/hf-model-downloads\",\n    trust_remote_code=True\n)\n```\n\n### स्पेस को टूल के रूप में इम्पोर्ट करें\n\nआप [`Tool.from_space`] मेथड का उपयोग करके हब से एक स्पेस को सीधे टूल 
के रूप में इम्पोर्ट कर सकते हैं!\n\nआपको केवल हब पर स्पेस की ID, इसका नाम, और एक विवरण प्रदान करने की आवश्यकता है जो आपके एजेंट को समझने में मदद करेगा कि टूल क्या करता है। अंदर से, यह स्पेस को कॉल करने के लिए [`gradio-client`](https://pypi.org/project/gradio-client/) लाइब्रेरी का उपयोग करेगा।\n\nउदाहरण के लिए, चलिए हब से [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) स्पेस को इम्पोर्ट करें और इसका उपयोग एक इमेज जनरेट करने के लिए करें।\n\n```python\nimage_generation_tool = Tool.from_space(\n    \"black-forest-labs/FLUX.1-schnell\",\n    name=\"image_generator\",\n    description=\"Generate an image from a prompt\"\n)\n\nimage_generation_tool(\"A sunny beach\")\n```\nऔर देखो, यह तुम्हारी छवि है! 🏖️\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp\">\n\nफिर आप इस टूल का उपयोग किसी अन्य टूल की तरह कर सकते हैं। उदाहरण के लिए, चलिए प्रॉम्प्ट `a rabbit wearing a space suit` को सुधारें और इसकी एक इमेज जनरेट करें। यह उदाहरण यह भी दिखाता है कि आप एजेंट को अतिरिक्त आर्ग्यूमेंट्स कैसे पास कर सकते हैं।\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\nagent = CodeAgent(tools=[image_generation_tool], model=model)\n\nagent.run(\n    \"Improve this prompt, then generate an image of it.\", additional_args={'user_prompt': 'A rabbit wearing a space suit'}\n)\n```\n\n```text\n=== Agent thoughts:\nimproved_prompt could be \"A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background\"\n\nNow that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.\n>>> Agent is executing the code below:\nimage = image_generator(prompt=\"A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the 
background\")\nfinal_answer(image)\n```\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp\">\n\nयह कितना कूल है? 🤩\n\n### LangChain टूल्स का उपयोग करें\n\nहम LangChain को पसंद करते हैं और मानते हैं कि इसके पास टूल्स का एक बहुत आकर्षक संग्रह है।\nLangChain से एक टूल इम्पोर्ट करने के लिए, `from_langchain()` मेथड का उपयोग करें।\n\nयहाँ बताया गया है कि आप LangChain वेब सर्च टूल का उपयोग करके परिचय के सर्च रिजल्ट को कैसे फिर से बना सकते हैं।\nइस टूल को काम करने के लिए `pip install langchain google-search-results -q` की आवश्यकता होगी।\n```python\nfrom langchain.agents import load_tools\n\nsearch_tool = Tool.from_langchain(load_tools([\"serpapi\"])[0])\n\nagent = CodeAgent(tools=[search_tool], model=model)\n\nagent.run(\"How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?\")\n```\n\n### अपने एजेंट के टूलबॉक्स को मैनेज करें\n\nआप एजेंट के टूलबॉक्स को `agent.tools` एट्रिब्यूट में एक टूल जोड़कर या बदलकर मैनेज कर सकते हैं, क्योंकि यह एक स्टैंडर्ड डिक्शनरी है।\n\nचलिए केवल डिफ़ॉल्ट टूलबॉक्स के साथ इनिशियलाइज़ किए गए मौजूदा एजेंट में `model_download_tool` जोड़ें।\n\n```python\nfrom smolagents import InferenceClientModel\n\nmodel = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\nagent.tools[model_download_tool.name] = model_download_tool\n```\nअब हम नए टूल का लाभ उठा सकते हैं।\n\n```python\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub but reverse the letters?\"\n)\n```\n\n\n> [!TIP]\n> एजेंट में बहुत अधिक टूल्स न जोड़ने से सावधान रहें: यह कमजोर LLM इंजन को ओवरव्हेल्म कर सकता है।\n\n\n### टूल्स का कलेक्शन उपयोग करें\n\nआप `ToolCollection` ऑब्जेक्ट का उपयोग करके टूल कलेक्शंस का लाभ उठा सकते हैं। यह या तो हब से एक कलेक्शन या 
MCP सर्वर टूल्स को लोड करने का समर्थन करता है।\n\n#### हब में कलेक्शन से टूल कलेक्शन\n\nआप उस कलेक्शन के स्लग के साथ इसका लाभ उठा सकते हैं जिसका आप उपयोग करना चाहते हैं।\nफिर उन्हें अपने एजेंट को इनिशियलाइज़ करने के लिए एक लिस्ट के रूप में पास करें, और उनका उपयोग शुरू करें!\n\n```py\nfrom smolagents import ToolCollection, CodeAgent\n\nimage_tool_collection = ToolCollection.from_hub(\n    collection_slug=\"huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f\",\n    token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\"\n)\nagent = CodeAgent(tools=[*image_tool_collection.tools], model=model, add_base_tools=True)\n\nagent.run(\"Please draw me a picture of rivers and lakes.\")\n```\n\nस्टार्ट को तेज करने के लिए, टूल्स केवल तभी लोड होते हैं जब एजेंट द्वारा कॉल किए जाते हैं।\n\n#### किसी भी MCP सर्वर से टूल कलेक्शन\n\n[glama.ai](https://glama.ai/mcp/servers) या [smithery.ai](https://smithery.ai/) पर उपलब्ध सैकड़ों MCP सर्वर्स से टूल्स का लाभ उठाएं।\n\nMCP सर्वर्स टूल्स को निम्नानुसार `ToolCollection` ऑब्जेक्ट में लोड किया जा सकता है:\n\n```py\nimport os\n\nfrom smolagents import ToolCollection, CodeAgent\nfrom mcp import StdioServerParameters\n\nserver_parameters = StdioServerParameters(\n    command=\"uv\",\n    args=[\"--quiet\", \"pubmedmcp@0.1.3\"],\n    env={\"UV_PYTHON\": \"3.12\", **os.environ},\n)\n\n# `model` is the InferenceClientModel instance defined earlier in this guide\nwith ToolCollection.from_mcp(server_parameters, trust_remote_code=True) as tool_collection:\n    agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)\n    agent.run(\"Please find a remedy for hangover.\")\n```"
  },
  {
    "path": "docs/source/ko/_config.py",
    "content": "# docstyle-ignore\nINSTALL_CONTENT = \"\"\"\n# Installation\n! pip install smolagents\n# To install from source instead of the last release, comment the command above and uncomment the following one.\n# ! pip install git+https://github.com/huggingface/smolagents.git\n\"\"\"\n\nnotebook_first_cells = [{\"type\": \"code\", \"content\": INSTALL_CONTENT}]\nblack_avoid_patterns = {\n    \"{processor_class}\": \"FakeProcessorClass\",\n    \"{model_class}\": \"FakeModelClass\",\n    \"{object_class}\": \"FakeObjectClass\",\n}\n"
  },
  {
    "path": "docs/source/ko/_toctree.yml",
    "content": "- title: Get started\n  sections:\n  - local: index\n    title: 소개\n  - local: installation\n    title: 설치 옵션\n  - local: guided_tour\n    title: 안내서\n- title: 튜토리얼\n  sections:\n  - local: tutorials/building_good_agents\n    title: 좋은 에이전트 구축하기\n  - local: tutorials/inspect_runs\n    title: 📊 텔레메트리로 에이전트 실행 검사하기\n#  - local: tutorials/tools\n#    title: 🛠️ Tools - in-depth guide\n#  - local: tutorials/secure_code_execution\n#    title: 🛡️ Secure code execution\n  - local: tutorials/memory\n    title: 📚 에이전트 메모리 관리\n- title: Conceptual guides\n  sections:\n#  - local: conceptual_guides/intro_agents\n#    title: 🤖 What are agents?\n  - local: conceptual_guides/react\n    title: 🤔 멀티스텝 에이전트는 어떻게 동작하나요?\n- title: 예제\n  sections:\n  - local: examples/text_to_sql\n    title: 스스로 오류를 수정하는 Text-to-SQL\n  - local: examples/rag\n    title: Agentic RAG로 지식 베이스 완전 정복하기\n  - local: examples/multiagents\n    title: 멀티 에이전트 시스템 오케스트레이션\n  - local: examples/web_browser\n    title: 비전 모델을 활용한 웹 브라우저 에이전트 만들기\n  - local: examples/using_different_models\n    title: 다양한 모델 사용하기\n  - local: examples/plan_customization\n    title: \"Human-in-the-Loop: 사용자와 상호작용하며 에이전트 계획 수정하기\"\n  - local: examples/async_agent\n    title: 에이전트를 활용한 비동기 애플리케이션\n- title: Reference\n  sections:\n  - local: reference/agents\n    title: 에이전트 관련 객체\n  - local: reference/models\n    title: 모델 관련 클래스\n    sections:\n    - title: 도구 관련 클래스\n      local: reference/tools\n#    - title: Built-in Tools\n#      local: reference/default_tools"
  },
  {
    "path": "docs/source/ko/conceptual_guides/react.md",
    "content": "# 멀티스텝 에이전트는 어떻게 동작하나요?[[how-do-multi-step-agents-work]]\n\nReAct 프레임워크([Yao et al., 2022](https://huggingface.co/papers/2210.03629))는 현재 에이전트 구축하는 가장 일반적인 접근 방식입니다.\n\nReAct라는 이름은 \"추론(Reason)\"과 \"행동(Act)\"을 결합한 것입니다. 실제로 이 구조를 따르는 에이전트는 주어진 작업을 해결하기 위해 필요한 만큼 여러 단계를 거칩니다. 각 단계는 추론 단계와 행동 단계로 이루어져 있으며, 행동 단계에서는 작업 해결에 가까워지도록 다양한 도구를 호출합니다.\n\n`smolagents`가 제공하는 모든 에이전트는 ReAct 프레임워크를 추상화한 단일 `MultiStepAgent` 클래스를 기반으로 합니다.\n\n이 클래스는 기본적으로 아래와 같은 루프로 동작하며, 기존 변수와 지식도 에이전트 로그에 함께 반영됩니다.\n\n초기화: 시스템 프롬프트는 `SystemPromptStep`에 저장되고, 사용자가 입력한 쿼리는 `TaskStep`에 기록됩니다.\n\nWhile 루프 (ReAct 루프):\n\n- `agent.write_memory_to_messages()`를 사용하여 에이전트 로그를 LLM이 읽을 수 있는 [채팅 메시지](https://huggingface.co/docs/transformers/en/chat_templating) 목록에 작성합니다.\n- 이 메시지를 `Model` 객체에 전송하여 응답을 받습니다. 에이전트는 응답을 파싱하여 액션(`ToolCallingAgent`의 경우 JSON blob, `CodeAgent`의 경우 코드 스니펫)을 추출합니다.\n- 액션을 실행하고 결과를 메모리에 기록합니다(`ActionStep`).\n- 각 단계가 끝날 때마다 `agent.step_callbacks`에 정의된 모든 콜백 함수를 실행합니다.\n\n**계획(planning)**이 활성화된 경우에는 주기적으로 계획을 수정하고 이를 `PlanningStep`에 저장할 수 있습니다. 이 과정에는 현재 작업과 관련된 사실을 메모리에 기록하는 것도 포함됩니다.\n\n`CodeAgent`의 경우, 이 과정은 아래 그림과 같이 나타납니다.\n\n<div class=\"flex justify-center\">\n    <img\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/codeagent_docs.png\"\n    />\n</div>\n\n비디오를 통해 동작 과정을 확인해보세요.\n\n<div class=\"flex justify-center\">\n    <img\n        class=\"block dark:hidden\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n    <img\n        class=\"hidden dark:block\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n</div>\n\n`smolagents` 라이브러리는 두 가지 버전의 에이전트를 제공합니다.\n- 도구 호출을 Python 코드 스니펫 형태로 생성하는 [`CodeAgent`]\n- 많은 프레임워크에서 일반적으로 사용하는 방식처럼 도구 호출을 JSON 형태로 작성하는 [`ToolCallingAgent`]\n\n필요에 따라 두 방식 중 어느 것이든 사용할 수 있습니다. 
예를 들어, 웹 브라우징처럼 각 페이지 상호작용 후 대기 시간이 필요한 경우 JSON 기반 도구 호출이 잘 맞을 수 있습니다.\n\n> [!TIP]\n> 멀티스텝 에이전트에 대해 더 알고 싶다면 허깅페이스 블로그의 [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) 포스트를 읽어보세요."
  },
  {
    "path": "docs/source/ko/examples/async_agent.md",
    "content": "# 에이전트를 활용한 비동기 애플리케이션[[async-applications-with-agents]]\n\n이 가이드는 smolagents 라이브러리의 동기 에이전트를 Starlette 기반의 비동기 Python 웹 애플리케이션에 통합하는 방법을 설명합니다.\n비동기 Python과 에이전트 통합을 처음 접하는 사용자들이 동기 에이전트 로직과 비동기 웹 서버를 효과적으로 결합하는 모범 사례를 익힐 수 있도록 구성했습니다.\n\n## 개요[[overview]]\n\n- **Starlette**: Python에서 비동기 웹 애플리케이션을 구축하기 위한 경량 ASGI 프레임워크입니다.\n- **anyio.to_thread.run_sync**: 블로킹(동기) 코드를 백그라운드 스레드에서 실행하여 비동기 이벤트 루프를 차단하지 않도록 하는 유틸리티입니다.\n- **CodeAgent**: 프로그래밍 방식으로 작업을 해결할 수 있는 `smolagents` 라이브러리의 에이전트입니다.\n\n## 백그라운드 스레드를 사용하는 이유는?[[why-use-a-background-thread?]]\n\n`CodeAgent.run()`은 Python 코드를 동기적으로 실행합니다. 비동기 엔드포인트에서 직접 호출하면 Starlette의 이벤트 루프를 차단하여 성능과 확장성이 저하됩니다. `anyio.to_thread.run_sync`로 이 작업을 백그라운드 스레드로 위임하면 높은 동시성에서도 앱의 응답성과 효율성을 유지할 수 있습니다.\n\n## 예시 워크플로우[[example-workflow]]\n\n- Starlette 앱은 `task` 문자열이 포함된 JSON 페이로드를 받는 `/run-agent` 엔드포인트를 노출합니다.\n- 요청이 수신되면 `anyio.to_thread.run_sync`를 사용하여 백그라운드 스레드에서 에이전트가 실행됩니다.\n- 결과는 JSON 응답으로 반환됩니다.\n\n## CodeAgent를 활용한 Starlette 앱 구축[[building-a-starlette-app-with-a-codeagent]]\n\n### 1. 의존성 설치[[1.-install-dependencies]]\n\n```bash\npip install smolagents starlette anyio uvicorn\n```\n\n### 2. 애플리케이션 코드 (`main.py`)[[2.-application-code-(`main.py`)]]\n\n```python\nimport anyio.to_thread\nfrom starlette.applications import Starlette\nfrom starlette.requests import Request\nfrom starlette.responses import JSONResponse\nfrom starlette.routing import Route\n\nfrom smolagents import CodeAgent, InferenceClientModel\n\nagent = CodeAgent(\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n    tools=[],\n)\n\nasync def run_agent(request: Request):\n    data = await request.json()\n    task = data.get(\"task\", \"\")\n    # Run the agent synchronously in a background thread\n    result = await anyio.to_thread.run_sync(agent.run, task)\n    return JSONResponse({\"result\": result})\n\napp = Starlette(routes=[\n    Route(\"/run-agent\", run_agent, methods=[\"POST\"]),\n])\n```\n\n### 3. 
앱 실행[[3.-run-the-app]]\n\n```bash\nuvicorn async_agent.main:app --reload\n```\n\n### 4. 엔드포인트 테스트[[4.-test-the-endpoint]]\n\n```bash\ncurl -X POST http://localhost:8000/run-agent -H 'Content-Type: application/json' -d '{\"task\": \"What is 2+2?\"}'\n```\n\n**예상 응답:**\n\n```json\n{\"result\": \"4\"}\n```\n\n## 추가 자료[[further-reading]]\n\n- [Starlette 문서](https://www.starlette.io/)\n- [anyio 문서](https://anyio.readthedocs.io/)\n\n---\n\n전체 코드는 [`examples/async_agent`](https://github.com/huggingface/smolagents/tree/main/examples/async_agent)를 참고하세요.\n"
  },
  {
    "path": "docs/source/ko/examples/multiagents.md",
    "content": "# 멀티 에이전트 시스템 오케스트레이션 🤖🤝🤖\n\n[[Colab에서 열기]]\n\n이 노트북에서는 **멀티 에이전트 웹 브라우저**를 만들어보겠습니다. 이는 웹을 사용하여 문제를 해결하기 위해 여러 에이전트가 협력하는 에이전트 시스템입니다!\n\n멀티 에이전트는 간단한 계층 구조로 구성됩니다.\n\n```\n              +----------------+\n              | Manager agent  |\n              +----------------+\n                       |\n        _______________|______________\n       |                              |\nCode Interpreter            +------------------+\n    tool                    | Web Search agent |\n                            +------------------+\n                               |            |\n                        Web Search tool     |\n                                   Visit webpage tool\n```\n이 시스템을 설정해보겠습니다. \n\n다음 명령어를 실행하여 필요한 종속성을 설치합니다.\n\n```py\n!pip install smolagents[toolkit] --upgrade -q\n```\n\nInference Providers를 사용하기 위해 Hugging Face에 로그인합니다:\n\n```py\nfrom huggingface_hub import login\n\nlogin()\n```\n\n⚡️ 에이전트는 Hugging Face의 Inference API를 사용하는 `InferenceClientModel` 클래스를 통해 [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking)로 구동됩니다. Inference API를 사용하면 모든 오픈소스 모델을 빠르고 쉽게 실행할 수 있습니다.\n\n> [!TIP]\n> Inference Providers는 서버리스 추론 파트너가 지원하는 수백 개의 모델에 대한 액세스를 제공합니다. 
지원되는 프로바이더 목록은 [여기](https://huggingface.co/docs/inference-providers/index)에서 확인할 수 있습니다.\n\n```py\nmodel_id = \"Qwen/Qwen3-Next-80B-A3B-Thinking\"\n```\n\n## 🔍 웹 검색 도구 생성\n\n웹 브라우징을 위해 Google 검색과 동등한 기능을 제공하는 기본 [`WebSearchTool`] 도구를 이미 사용할 수 있습니다.\n\n하지만 `WebSearchTool`에서 찾은 페이지를 확인할 수 있는 기능도 필요합니다.\n이를 위해 라이브러리에 내장된 `VisitWebpageTool`을 사용할 수도 있지만, 작동 원리를 이해하기 위해 직접 구현해보겠습니다.\n\n`markdownify`를 사용하여 `VisitWebpageTool` 도구를 처음부터 만들어보겠습니다.\n\n```py\nimport re\nimport requests\nfrom markdownify import markdownify\nfrom requests.exceptions import RequestException\nfrom smolagents import tool\n\n\n@tool\ndef visit_webpage(url: str) -> str:\n    \"\"\"주어진 URL의 웹페이지에 접속하여 그 내용을 마크다운 형식으로 반환합니다.\n\n    Args:\n        url: 방문할 웹페이지의 URL.\n\n    Returns:\n        마크다운으로 변환된 웹페이지 내용, 또는 요청이 실패할 경우 오류 메시지.\n    \"\"\"\n    # 섹션 제목은 영문 `Args:`/`Returns:`로 유지합니다: `@tool`이 `Args:` 섹션에서 인수 설명을 읽습니다.\n    try:\n        # URL에 GET 요청 전송\n        response = requests.get(url)\n        response.raise_for_status()  # 잘못된 상태 코드에 대해 예외 발생\n\n        # HTML 내용을 마크다운으로 변환\n        markdown_content = markdownify(response.text).strip()\n\n        # 여러 줄 바꿈 제거\n        markdown_content = re.sub(r\"\\n{3,}\", \"\\n\\n\", markdown_content)\n\n        return markdown_content\n\n    except RequestException as e:\n        return f\"Error fetching the webpage: {str(e)}\"\n    except Exception as e:\n        return f\"An unexpected error occurred: {str(e)}\"\n```\n\n이제 도구를 초기화하고 테스트해보겠습니다!\n\n```py\nprint(visit_webpage(\"https://en.wikipedia.org/wiki/Hugging_Face\")[:500])\n```\n\n## 멀티 에이전트 시스템 구축 🤖🤝🤖\n\n이제 `search`와 `visit_webpage` 도구가 모두 준비되었으므로, 이를 사용하여 웹 에이전트를 생성할 수 있습니다.\n\n이 에이전트에 어떤 구성을 선택할까요?\n- 웹 브라우징은 병렬 도구 호출이 필요없는 단일 타임라인 작업이므로, JSON 도구 호출 방식이 적합합니다. 
따라서 `ToolCallingAgent`를 선택합니다.\n- 또한 웹 검색은 올바른 답을 찾기 전에 많은 페이지를 탐색해야 하는 경우가 있으므로, `max_steps`를 10으로 늘리는 것이 좋습니다.\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    InferenceClientModel,\n    WebSearchTool,\n)\n\nmodel = InferenceClientModel(model_id=model_id)\n\nweb_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), visit_webpage],\n    model=model,\n    max_steps=10,\n    name=\"web_search_agent\",\n    description=\"Runs web searches for you.\",\n)\n```\n\n이 에이전트에 `name`과 `description` 속성을 부여했습니다. 이는 이 에이전트가 매니저 에이전트에 의해 호출될 수 있도록 하는 필수 속성입니다.\n\n그 다음 매니저 에이전트를 생성하고, 초기화 시 `managed_agents` 인수에 관리되는 에이전트를 전달합니다.\n\n이 에이전트는 계획과 사고를 담당하므로, 고급 추론이 유용할 것입니다. 따라서 `CodeAgent`가 잘 작동할 것입니다.\n\n또한 현재 연도를 포함하고 추가 데이터 계산을 수행하는 질문을 하고 싶으므로, 에이전트가 이러한 패키지를 필요로 할 경우에 대비해 `additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"]`를 추가해보겠습니다.\n\n```py\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[web_agent],\n    additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"],\n)\n```\n\n이게 전부입니다! 이제 시스템을 실행해보겠습니다! 계산과 연구가 모두 필요한 질문을 선택합니다.\n\n```py\nanswer = manager_agent.run(\"LLM 훈련이 현재 속도로 2030년까지 계속 확장된다면, 2030년까지 가장 큰 훈련 실행에 전력을 공급하는 데 필요한 전력량은 GW 단위로 얼마가 될까요? 이는 일부 국가들과 비교했을 때 무엇에 해당할까요? 사용된 모든 수치에 대한 출처를 제공해주세요.\")\n```\n\n답변으로 이런 보고서를 받습니다.\n```\n현재 성장 전망과 에너지 소비량 추정에 따르면,\n2030년까지 LLM 교육이 현재 속도로 계속 확장된다면 다음과 같이 예상됩니다.\n\n1. 2030년까지 가장 큰 훈련 실행에 전력을 공급하는 데 필요한 전력량은 약 303.74 GW가 될 것이며, \n이는 연간 약 2,660,762 GWh로 환산됩니다.\n\n2. 국가별 전력 소비량 비교\n   - 중국 총 전력 소비량의 약 34%에 해당합니다.\n   - 인도(184%), 러시아(267%), 일본(291%)의 총 전력 소비량을 초과할 것입니다.\n   - 이탈리아나 멕시코 같은 국가들의 전력 소비량의 거의 9배가 됩니다.\n\n3. 수치 출처\n   - 미래 LLM 훈련을 위한 5 GW의 초기 추정치는 AWS CEO Matt Garman에서 나온 것입니다.\n   - 성장 예측은 Springs의 시장 조사에서 79.80%의 CAGR을 사용했습니다.\n   - 국가 전력 소비 데이터는 주로 2021년 기준으로 미국 에너지 정보 관리청에서 나온 것입니다.\n```\n\n[스케일링 가설](https://gwern.net/scaling-hypothesis)이 계속 참이라면 상당히 큰 발전소가 필요할 것 같습니다.\n\n에이전트들이 작업을 해결하기 위해 효율적으로 협력했습니다! 
✅\n\n💡 이 오케스트레이션을 더 많은 에이전트로 쉽게 확장할 수 있습니다: 하나는 코드 실행을, 다른 하나는 웹 검색을, 또 다른 하나는 파일 처리를 담당하는 식으로...\n"
  },
  {
    "path": "docs/source/ko/examples/plan_customization.md",
    "content": "# Human-in-the-Loop: 사용자와 상호작용하며 에이전트 계획 수정하기 [[humanintheloop-customize-agent-plan-interactively]]\n\n이 페이지에서는 smolagents 라이브러리의 고급 사용법을 소개합니다. 특히 사용자와의 상호작용을 통한 계획 생성, 계획 수정, 그리고 에이전트 워크플로에서의 메모리 보존을 위한 Human-in-the-Loop (HITL) 접근 방식을 중점적으로 설명합니다.\n예제는 `examples/plan_customization/plan_customization.py`의 코드를 기반으로 합니다.\n\n## 개요 [[overview]]\n\n이 예제는 다음과 같은 Human-in-the-Loop 전략을 구현하는 방법을 안내합니다.\n\n- 단계 콜백(step callback)을 사용하여 계획 생성 후 에이전트 실행 중단하기\n- 사용자가 실행 전에 에이전트의 계획을 검토하고 수정할 수 있도록 지원 (Human-in-the-Loop)\n- 에이전트의 메모리를 보존하면서 실행 재개하기\n- 사용자 피드백을 기반으로 계획을 동적으로 업데이트하여 사용자가 제어권을 유지하도록 지원\n\n## 핵심 개념 [[key-concepts]]\n\n### 단계 콜백을 이용한 계획 중단 [[step-callbacks-for-plan-interruption]]\n\n에이전트가 계획을 생성한 후 일시 중지하도록 설정할 수 있습니다. 이는 PlanningStep에 단계 콜백을 등록하여 구현합니다.\n\n```python\nagent = CodeAgent(\n    model=InferenceClientModel(),\n    tools=[DuckDuckGoSearchTool()],\n    planning_interval=5,  # 5단계마다 계획\n    step_callbacks={PlanningStep: interrupt_after_plan},\n    max_steps=10,\n    verbosity_level=1\n)\n```\n\n### Human-in-the-Loop: 대화형 계획 검토 및 수정 [[humanintheloop-interactive-plan-review-and-modification]]\n\n에이전트가 계획을 생성하면, 콜백 함수가 해당 계획을 사용자에게 보여주고 다음 옵션 중 하나를 선택하도록 안내합니다.\n\n1. 계획 승인\n2. 계획 수정\n3. 실행 취소\n\n예제 상호작용:\n\n```\n============================================================\n🤖 에이전트 계획 생성됨\n============================================================\n1. 최근 AI 발전 사항 검색\n2. 상위 결과 분석\n3. 가장 중요한 3가지 돌파구 요약\n4. 각 돌파구에 대한 소스 포함\n============================================================\n\n옵션을 선택하세요.\n1. 계획 승인\n2. 계획 수정\n3. 취소\n선택 (1-3):\n```\n\n이 Human-in-the-Loop 단계를 통해 사용자는 실행이 계속되기 전에 개입하여 계획을 검토하고 수정할 수 있으며, 이를 통해 에이전트의 행동이 사용자의 의도와 일치하도록 보장할 수 있습니다.\n\n사용자가 수정을 선택하면 계획을 직접 편집할 수 있으며, 업데이트된 계획은 이후 실행 단계에서 사용됩니다.\n\n### 메모리 보존 및 실행 재개 [[memory-preservation-and-resuming-execution]]\n\n`reset=False` 옵션으로 에이전트를 실행하면 이전의 모든 단계와 메모리가 보존됩니다. 
이를 통해 중단 또는 계획 수정 후에도 실행을 이어서 진행할 수 있습니다.\n\n```python\n# 첫 번째 실행 (중단될 수 있음)\nagent.run(task, reset=True)\n\n# 보존된 메모리로 재개\nagent.run(task, reset=False)\n```\n\n### 에이전트 메모리 검사 [[inspecting-agent-memory]]\n\n에이전트의 메모리를 검사하여 지금까지 수행된 모든 단계를 확인할 수 있습니다.\n\n```python\nprint(f\"현재 메모리에 {len(agent.memory.steps)}개의 단계가 포함되어 있습니다.\")\nfor i, step in enumerate(agent.memory.steps):\n    step_type = type(step).__name__\n    print(f\"  {i+1}. {step_type}\")\n```\n\n## Human-in-the-Loop 워크플로우 예시 [[example-humanintheloop-workflow]]\n\n1. 에이전트가 복잡한 작업을 받아 실행을 시작합니다.\n2. 계획 단계가 생성되고, 사용자 검토를 위해 실행이 일시 중지됩니다.\n3. 사용자가 계획을 검토하고 필요에 따라 수정합니다 (Human-in-the-Loop).\n4. 승인되거나 수정된 계획으로 실행을 재개합니다.\n5. 모든 단계는 향후 실행을 위해 보존되어 투명성과 제어권을 보장합니다.\n\n## 오류 처리 [[error-handling]]\n\n예제는 다음에 대한 오류 처리를 포함합니다.\n- 사용자 취소\n- 계획 수정 오류\n- 실행 재개 실패\n\n## 요구사항 [[requirements]]\n\n- smolagents 라이브러리\n- DuckDuckGoSearchTool (smolagents에 포함)\n- InferenceClientModel (🤗 Hugging Face API 토큰 필요)\n\n## 교육적 가치 [[educational-value]]\n\n이 예제는 다음을 시연합니다.\n- 사용자 정의 에이전트 동작을 위한 단계 콜백 구현 방법\n- 다중 단계 에이전트의 메모리 관리 기법\n- 에이전트 시스템의 사용자 상호작용 패턴\n- 동적 에이전트 제어를 위한 계획 수정 기법\n- 대화형 에이전트 시스템의 오류 처리 방법\n\n---\n\n전체 코드는 [`examples/plan_customization`](https://github.com/huggingface/smolagents/tree/main/examples/plan_customization)에서 확인하세요.\n"
  },
  {
    "path": "docs/source/ko/examples/rag.md",
    "content": "# Agentic RAG[[agentic-rag]]\n\n[[open-in-colab]]\n\n## RAG(검색 증강 생성) 소개[[introduction-to-retrieval-augmented-generation-rag]]\n\n검색 증강 생성(Retrieval-Augmented Generation, RAG)은 대규모 언어 모델의 능력과 외부 지식 검색을 결합하여 더 정확하고 사실에 기반을 두며 문맥에 맞는 응답을 생성합니다. RAG의 핵심은 \"대규모 언어 모델을 사용해 사용자 쿼리에 답변을 제공하되, 지식 베이스에서 검색된 정보에 기반하여 답변하는 것\"입니다.\n\n### RAG를 사용하는 이유[[why-use-rag]]\n\nRAG는 기본 대규모 언어 모델이나 미세 조정된 모델을 사용하는 것에 비해 다음과 같은 몇 가지 중요한 장점을 제공합니다.\n\n1. **사실 기반 생성**: 답변의 근거를 검색 결과에 두어 환각 현상을 줄입니다.\n2. **도메인 특화**: 모델을 다시 훈련시키지 않고도 특정 도메인의 지식을 제공합니다.\n3. **최신 지식 반영**: 모델의 훈련 시점 이후의 정보에도 접근할 수 있습니다.\n4. **투명성**: 생성된 콘텐츠의 출처를 인용할 수 있습니다.\n5. **제어**: 모델이 접근할 수 있는 정보를 세밀하게 제어할 수 있습니다.\n\n### 전통적인 RAG의 한계[[limitations-of-traditional-rag]]\n\n이러한 장점에도 불구하고, 전통적인 RAG 접근 방식은 다음과 같은 몇 가지 문제가 있습니다.\n\n- **단일 검색 단계**: 초기 검색 결과가 좋지 않으면 최종 생성 결과의 품질이 저하됩니다.\n- **쿼리-문서 불일치**: 사용자 쿼리(주로 질문)가 답변을 포함하는 문서(주로 서술문)와 잘 일치하지 않을 수 있습니다.\n- **제한된 추론**: 단순한 RAG 파이프라인은 다단계 논리적 추론이나 쿼리 정제를 허용하지 않습니다.\n- **컨텍스트 윈도우 제약**: 검색된 문서는 모델의 컨텍스트 윈도우 크기에 맞춰야 합니다.\n\n## Agentic RAG: 더 강력한 접근 방식[[agentic-rag-a-more-powerful-approach]]\n\n**Agentic RAG** 시스템, 즉 검색 능력을 갖춘 에이전트를 구현함으로써 이러한 한계를 극복할 수 있습니다. 이 접근 방식은 RAG를 경직된 파이프라인에서 논리적 추론 중심의 상호작용적 프로세스로 탈바꿈시키는 방식입니다.\n\n### Agentic RAG의 주요 장점[[key-benefits-of-agentic-rag]]\n\n검색 도구를 갖춘 에이전트는 다음을 수행할 수 있습니다.\n\n1. ✅ **최적화된 쿼리 생성**: 에이전트는 사용자 질문을 검색에 적합한 쿼리로 변환할 수 있습니다.\n2. ✅ **다중 검색 수행**: 에이전트는 필요에 따라 반복적으로 정보를 검색할 수 있습니다.\n3. ✅ **검색 결과 기반 논리적 추론**: 에이전트는 여러 소스의 정보를 분석, 종합하고 결론을 도출할 수 있습니다.\n4. ✅ **자체 평가 및 개선**: 에이전트는 검색 결과를 평가하고 접근 방식을 조정할 수 있습니다.\n\n이 접근 방식은 다음과 같은 Agentic RAG 기술을 자연스럽게 구현합니다.\n- **가상 문서 임베딩(HyDE)**: 사용자 쿼리를 직접 사용하는 대신, 에이전트가 검색에 최적화된 쿼리를 생성합니다 ([논문 참조](https://huggingface.co/papers/2212.10496))\n- **자가 쿼리 정제**: 에이전트가 초기 결과를 분석하고 정제된 쿼리로 후속 검색을 수행할 수 있습니다 ([기술 참조](https://docs.llamaindex.ai/en/stable/examples/evaluation/RetryQuery/))\n\n## Agentic RAG 시스템 구축하기[[building-an-agentic-rag-system]]\n\n이제 단계별로 Agentic RAG 시스템을 구축해 보겠습니다. 
이 예제에서는 허깅 페이스 Transformers 라이브러리 설명서를 검색해 질문에 답할 수 있는 에이전트를 만들어 보겠습니다.\n\n아래 코드 스니펫을 따라 하거나, smolagents GitHub 리포지토리에서 전체 예제를 확인할 수 있습니다: [examples/rag.py](https://github.com/huggingface/smolagents/blob/main/examples/rag.py).\n\n### 1단계: 필요한 의존성 설치하기[[step-1-install-required-dependencies]]\n\n먼저, 필요한 패키지를 설치해야 합니다.\n\n```bash\npip install smolagents pandas langchain langchain-community sentence-transformers datasets python-dotenv rank_bm25 --upgrade\n```\n\n허깅 페이스의 추론 API를 사용하려면 API 토큰을 설정해야 합니다.\n\n```python\n# 환경 변수 로드 (HF_TOKEN 포함)\nfrom dotenv import load_dotenv\nload_dotenv()\n```\n\n### 2단계: 지식 베이스 준비하기[[step-2-prepare-the-knowledge-base]]\n\n허깅 페이스 설명서가 포함된 데이터 세트를 불러와 검색에 사용할 준비를 해보겠습니다.\n\n```python\nimport datasets\nfrom langchain.docstore.document import Document\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain_community.retrievers import BM25Retriever\n\n# 허깅 페이스 설명서 데이터 세트 로드\nknowledge_base = datasets.load_dataset(\"m-ric/huggingface_doc\", split=\"train\")\n\n# Transformers 라이브러리 설명서만 포함하도록 필터링\nknowledge_base = knowledge_base.filter(lambda row: row[\"source\"].startswith(\"huggingface/transformers\"))\n\n# 데이터 세트 항목을 메타데이터가 있는 Document 객체로 변환\nsource_docs = [\n    Document(page_content=doc[\"text\"], metadata={\"source\": doc[\"source\"].split(\"/\")[1]})\n    for doc in knowledge_base\n]\n\n# 더 나은 검색을 위해 문서를 작은 청크로 분할\ntext_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=500,  # 청크당 문자 수\n    chunk_overlap=50,  # 컨텍스트 유지를 위한 청크 간 중첩\n    add_start_index=True,\n    strip_whitespace=True,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],  # 분할 우선순위\n)\ndocs_processed = text_splitter.split_documents(source_docs)\n\nprint(f\"Knowledge base prepared with {len(docs_processed)} document chunks\")\n```\n\n### 3단계: 검색 도구 만들기[[step-3-create-a-retriever-tool]]\n\n이제 에이전트가 지식 베이스에서 정보를 검색하는 데 사용할 수 있는 사용자 정의 도구를 만들어 보겠습니다.\n\n```python\nfrom smolagents import Tool\n\nclass RetrieverTool(Tool):\n    name = 
\"retriever\"\n    description = \"의미 기반 검색을 사용하여 쿼리에 답변하는 데 가장 관련성이 높은 transformers 설명서 부분을 검색합니다.\"\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"수행할 쿼리입니다. 대상 문서와 의미적으로 가까워야 합니다. 질문보다는 긍정문을 사용하세요.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, docs, **kwargs):\n        super().__init__(**kwargs)\n        # 처리된 문서로 검색기 초기화\n        self.retriever = BM25Retriever.from_documents(\n            docs, k=10  # 가장 관련성 높은 상위 10개 문서 반환\n        )\n\n    def forward(self, query: str) -> str:\n        \"\"\"제공된 쿼리를 기반으로 검색을 실행합니다.\"\"\"\n        assert isinstance(query, str), \"Your search query must be a string\"\n\n        # 관련 문서 검색\n        docs = self.retriever.invoke(query)\n\n        # 가독성을 위해 검색된 문서 형식 지정\n        return \"\\nRetrieved documents:\\n\" + \"\".join(\n            [\n                f\"\\n\\n===== Document {str(i)} =====\\n\" + doc.page_content\n                for i, doc in enumerate(docs)\n            ]\n        )\n\n# 처리된 문서로 검색 도구 초기화\nretriever_tool = RetrieverTool(docs_processed)\n```\n\n> [!TIP]\n> 단순성과 속도를 위해 어휘 검색 방식인 BM25를 사용하고 있습니다. 실제 서비스 환경에서는 검색 품질을 높이기 위해 임베딩을 활용한 의미 기반 검색을 사용하는 것이 좋습니다. 고품질 임베딩 모델은 [MTEB 리더보드](https://huggingface.co/spaces/mteb/leaderboard)에서 확인하세요.\n\n### 4단계: 고급 검색 에이전트 만들기[[step-4-create-an-advanced-retrieval-agent]]\n\n다음으로 앞서 만든 검색 도구를 활용해 질문에 답할 수 있는 에이전트를 구성해 봅시다.\n\n```python\nfrom smolagents import InferenceClientModel, CodeAgent\n\n# 검색 도구로 에이전트 초기화\nagent = CodeAgent(\n    tools=[retriever_tool],  # 에이전트가 사용할 수 있는 도구 목록\n    model=InferenceClientModel(),  # 기본 모델 \"Qwen/Qwen2.5-Coder-32B-Instruct\"\n    max_steps=4,  # 논리적 추론 단계 수 제한\n    verbosity_level=2,  # 에이전트의 상세한 논리적 추론 과정 표시\n)\n\n# 특정 모델을 사용하려면 다음과 같이 지정할 수 있습니다:\n# model=InferenceClientModel(model_id=\"meta-llama/Llama-3.3-70B-Instruct\")\n```\n\n> [!TIP]\n> Inference Provider는 서버리스 추론 파트너가 제공하는 수백 개의 모델에 대한 액세스를 제공합니다. 
지원되는 제공업체 목록은 [여기](https://huggingface.co/docs/inference-providers/index)에서 찾을 수 있습니다.\n\n### 5단계: 에이전트를 실행하여 질문에 답하기[[step-5-run-the-agent-to-answer-questions]]\n\n마지막으로 에이전트를 실행해 Transformers 관련 질문에 답해 보겠습니다.\n\n```python\n# 정보 검색이 필요한 질문하기\nquestion = \"For a transformers model training, which is slower, the forward or the backward pass?\"\n\n# 에이전트를 실행하여 답변 얻기\nagent_output = agent.run(question)\n\n# 최종 답변 표시\nprint(\"\\nFinal answer:\")\nprint(agent_output)\n```\n\n## Agentic RAG의 실제 적용 사례[[practical-applications-of-agentic-rag]]\n\nAgentic RAG 시스템은 다양한 사용 사례에 적용될 수 있습니다.\n\n1. **기술 문서 지원**: 사용자가 복잡한 기술 문서를 탐색하는 데 도움을 줍니다.\n2. **연구 논문 분석**: 과학 논문에서 정보를 추출하고 종합합니다.\n3. **법률 문서 검토**: 법률 문서에서 관련 판례와 조항을 찾습니다.\n4. **고객 지원**: 제품 설명서와 지식 베이스를 기반으로 질문에 답변합니다.\n5. **교육 튜터링**: 교과서와 학습 자료를 기반으로 설명을 제공합니다.\n\n## 결론[[conclusion]]\n\nAgentic RAG는 전통적인 RAG 파이프라인을 뛰어넘는 중요한 발전을 의미합니다. 대형 언어 모델 에이전트의 추론 능력과 검색 시스템의 사실 기반을 결합함으로써, 우리는 더 강력하고 유연하며 정확한 정보 시스템을 구축할 수 있습니다.\n\n저희가 보여드린 접근 방식은 다음과 같은 특징이 있습니다:\n- 단일 단계 검색의 한계를 극복합니다.\n- 지식 베이스와의 상호작용이 더 자연스러워집니다.\n- 자체 평가와 쿼리 정제를 통해 지속해서 개선할 수 있는 프레임워크를 제공합니다.\n\n자신만의 Agentic RAG 시스템을 구축할 때에는, 다양한 검색 방법과 에이전트 아키텍처, 지식 소스를 실험하며 사용 사례에 최적화된 구성을 찾아보세요."
  },
  {
    "path": "docs/source/ko/examples/text_to_sql.md",
    "content": "# Text-to-SQL[[text-to-sql]]\n\n[[open-in-colab]]\n\n이 튜토리얼에서는 `smolagents`를 사용해 SQL을 다루는 에이전트를 구현해보겠습니다.\n\n> 먼저 중요한 질문 하나로 시작하겠습니다. 그냥 간단하게 일반적인 text-to-SQL 파이프라인을 쓰면 안 될까요?\n\n표준 text-to-SQL 파이프라인은 안정성이 떨어지는 경우가 많습니다. 쿼리가 잘못 생성될 수 있고, 심지어는 오류 없이 틀리거나 쓸모없는 결과를 반환할 수도 있습니다.\n\n👉 반면, 에이전트 시스템은 출력 결과를 비판적으로 점검할 수 있고 쿼리를 수정할 필요가 있는지 스스로 결정할 수 있이 성능이 크게 향상됩니다.\n\n이제 이 에이전트를 직접 만들어봅시다! 💪\n\n아래 명령어를 실행해 필요한 의존성을 설치하세요:\n```bash\n!pip install smolagents python-dotenv sqlalchemy --upgrade -q\n```\n\n추론 프로바이더를 호출하려면 환경 변수 `HF_TOKEN`에 유효한 토큰이 설정되어 있어야 합니다.\npython-dotenv를 이용해 환경 변수를 불러오겠습니다.\n```py\nfrom dotenv import load_dotenv\nload_dotenv()\n```\n\n다음으로, SQL 환경을 구성하겠습니다:\n```py\nfrom sqlalchemy import (\n    create_engine,\n    MetaData,\n    Table,\n    Column,\n    String,\n    Integer,\n    Float,\n    insert,\n    inspect,\n    text,\n)\n\nengine = create_engine(\"sqlite:///:memory:\")\nmetadata_obj = MetaData()\n\ndef insert_rows_into_table(rows, table, engine=engine):\n    for row in rows:\n        stmt = insert(table).values(**row)\n        with engine.begin() as connection:\n            connection.execute(stmt)\n\ntable_name = \"receipts\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"customer_name\", String(16), primary_key=True),\n    Column(\"price\", Float),\n    Column(\"tip\", Float),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"customer_name\": \"Alan Payne\", \"price\": 12.06, \"tip\": 1.20},\n    {\"receipt_id\": 2, \"customer_name\": \"Alex Mason\", \"price\": 23.86, \"tip\": 0.24},\n    {\"receipt_id\": 3, \"customer_name\": \"Woodrow Wilson\", \"price\": 53.43, \"tip\": 5.43},\n    {\"receipt_id\": 4, \"customer_name\": \"Margaret James\", \"price\": 21.11, \"tip\": 1.00},\n]\ninsert_rows_into_table(rows, receipts)\n```\n\n### 에이전트 만들기[[build-our-agent]]\n\n이제 도구를 활용해 SQL 테이블을 조회할 수 있도록 만들어봅시다.\n\n툴의 설명 속성은 에이전트 시스템에 의해 LLM 
프롬프트에 포함되는 부분으로, LLM이 해당 도구를 어떻게 사용할 수 있는지에 대한 정보를 제공합니다. 바로 이 부분에 우리가 정의한 SQL 테이블의 설명을 작성하면 됩니다.\n\n```py\ninspector = inspect(engine)\ncolumns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(\"receipts\")]\n\ntable_description = \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\nprint(table_description)\n```\n\n```text\nColumns:\n  - receipt_id: INTEGER\n  - customer_name: VARCHAR(16)\n  - price: FLOAT\n  - tip: FLOAT\n```\n\n이제 우리만의 도구를 만들어봅시다. 도구는 아래와 같은 요소를 필요로 합니다. (자세한 내용은 [도구 문서](../tutorials/tools)를 참고하세요)\n- 인자(`Args:`) 목록이 포함된 docstring\n- 입력과 출력에 대한 타입 힌트\n\n```py\nfrom smolagents import tool\n\n@tool\ndef sql_engine(query: str) -> str:\n    \"\"\"\n    테이블에 SQL 쿼리를 수행할 수 있습니다. 결과를 문자열로 반환합니다.\n    테이블 이름은 'receipts'이며, 설명은 다음과 같습니다:\n        Columns:\n        - receipt_id: INTEGER\n        - customer_name: VARCHAR(16)\n        - price: FLOAT\n        - tip: FLOAT\n\n    Args:\n        query: 수행할 쿼리입니다. 올바른 SQL이어야 합니다.\n    \"\"\"\n    output = \"\"\n    with engine.connect() as con:\n        rows = con.execute(text(query))\n        for row in rows:\n            output += \"\\n\" + str(row)\n    return output\n```\n\n이제 이 도구를 사용하는 에이전트를 만들어보겠습니다.\n\n여기서는 `smolagents`의 메인 에이전트 클래스인 `CodeAgent`를 사용합니다. `CodeAgent`는 코드로 액션을 작성하고 ReAct 프레임워크에 따라 이전 출력 결과를 반복적으로 개선할 수 있습니다.\n\n모델은 에이전트 시스템을 구동하는 LLM을 의미합니다. `InferenceClientModel`을 사용하면 허깅페이스의 Inference API를 통해 서버리스 또는 Dedicated 엔드포인트 방식으로 LLM을 호출할 수 있으며, 필요에 따라 다른 사설 API를 사용할 수도 있습니다.\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"meta-llama/Llama-3.1-8B-Instruct\"),\n)\nagent.run(\"Can you give me the name of the client who got the most expensive receipt?\")\n```\n\n### 레벨 업: 테이블 조인[[level-2-table-joins]]\n\n이제 좀 더 어려운 과제를 해결해 볼까요? 
에이전트가 여러 테이블에 걸친 조인을 처리하도록 만들어 보겠습니다.\n\n이를 위해 각 `receipt_id`에 해당하는 웨이터의 이름을 기록하는 두 번째 테이블을 만들어 보겠습니다.\n\n```py\ntable_name = \"waiters\"\nwaiters = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"waiter_name\", String(16), primary_key=True),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"waiter_name\": \"Corey Johnson\"},\n    {\"receipt_id\": 2, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 3, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 4, \"waiter_name\": \"Margaret James\"},\n]\ninsert_rows_into_table(rows, waiters)\n```\n테이블이 변경되었기 때문에 LLM이 테이블 정보를 올바르게 활용할 수 있도록 `sql_engine`의 설명을 업데이트하겠습니다.\n\n```py\nupdated_description = \"\"\"Allows you to perform SQL queries on the table. Beware that this tool's output is a string representation of the execution output.\nIt can use the following tables:\"\"\"\n\ninspector = inspect(engine)\nfor table in [\"receipts\", \"waiters\"]:\n    columns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(table)]\n\n    table_description = f\"Table '{table}':\\n\"\n\n    table_description += \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\n    updated_description += \"\\n\\n\" + table_description\n\nprint(updated_description)\n```\n이번 요청은 이전보다 조금 더 어려우므로, 더 강력한 [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking) 모델을 사용하도록 LLM 엔진을 바꾸겠습니다!\n\n```py\nsql_engine.description = updated_description\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n)\n\nagent.run(\"Which waiter got more total money from tips?\")\n```\n바로 성공입니다! 놀라울 만큼 간단하게 설정되지 않았나요?\n\n이번 예제는 여기까지입니다! 지금까지 다음과 같은 개념들을 살펴보았습니다.\n- 새로운 도구 만들기\n- 도구 설명 업데이트하기\n- 더 강력한 LLM으로 에이전트 추론 능력 향상시키기\n\n✅ 이제 여러분이 꿈꿔왔던 text-to-SQL 시스템을 직접 만들어 보세요! ✨\n"
  },
  {
    "path": "docs/source/ko/examples/using_different_models.md",
    "content": "# 다양한 모델 사용하기 [[using-different-models]]\n\n[[open-in-colab]]\n\n`smolagents`는 다양한 프로바이더의 여러 언어 모델을 사용할 수 있는 유연한 프레임워크를 제공합니다.\n이 가이드는 에이전트와 함께 다양한 모델 유형을 사용하는 방법을 보여줍니다.\n\n## 사용 가능한 모델 유형 [[available-model-types]]\n\n`smolagents`는 기본적으로 여러 모델 유형을 지원합니다:\n1. [`InferenceClientModel`]: Hugging Face의 추론 API를 사용하여 모델에 접근\n2. [`TransformersModel`]: 🤗 Transformers 라이브러리를 사용하여 로컬에서 모델 실행\n3. [`VLLMModel`]: 최적화된 서빙으로 빠른 추론을 위해 vLLM 사용\n4. [`MLXModel`]: MLX를 사용하여 Apple Silicon 디바이스에 최적화\n5. [`LiteLLMModel`]: LiteLLM을 통해 수백 개의 대규모 언어 모델에 접근 제공\n6. [`LiteLLMRouterModel`]: 여러 모델 간에 요청을 분산\n7. [`OpenAIModel`]: OpenAI 호환 API를 구현하는 모든 프로바이더에 접근 제공\n8. [`AzureOpenAIModel`]: Azure의 OpenAI 서비스 사용\n9. [`AmazonBedrockModel`]: AWS Bedrock의 API에 연결\n\n모든 모델 클래스는 인스턴스화 시점에 추가 키워드 인수들(`temperature`, `max_tokens`, `top_p` 등)을 직접 전달하는 것을 지원합니다.\n이러한 매개변수들은 자동으로 기본 모델의 완성 호출로 전달되어, 창의성, 응답 길이, 샘플링 전략과 같은 모델 동작을 구성할 수 있게 해줍니다.\n\n## Google Gemini 모델 사용하기 [[using-google-gemini-models]]\n\nGoogle Gemini API 문서(https://ai.google.dev/gemini-api/docs/openai)에서 설명한 바와 같이,\nGoogle은 Gemini 모델에 대해 OpenAI 호환 API를 제공하므로, 적절한 베이스 URL을 설정하여\n[`OpenAIModel`]을 Gemini 모델과 함께 사용할 수 있습니다.\n\n먼저, 필요한 의존성을 설치합니다:\n```bash\npip install 'smolagents[openai]'\n```\n\n그다음, [Gemini API 키를 얻고](https://ai.google.dev/gemini-api/docs/api-key) 코드에서 설정합니다:\n```python\nGEMINI_API_KEY = <YOUR-GEMINI-API-KEY>\n```\n\n이제 `OpenAIModel` 클래스를 사용하고 `api_base` 매개변수를 Gemini API 베이스 URL로 설정하여\nGemini 모델을 초기화할 수 있습니다:\n```python\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"gemini-2.0-flash\",\n    # Google Gemini OpenAI 호환 API 베이스 URL\n    api_base=\"https://generativelanguage.googleapis.com/v1beta/openai/\",\n    api_key=GEMINI_API_KEY,\n)\n```\n\n## OpenRouter 모델 사용하기 [[using-openrouter-models]]\n\nOpenRouter는 통합된 OpenAI 호환 API를 통해 다양한 언어 모델에 대한 접근을 제공합니다.\n적절한 베이스 URL을 설정하여 [`OpenAIModel`]을 사용해 OpenRouter에 연결할 수 있습니다.\n\n먼저, 필요한 의존성을 설치합니다:\n```bash\npip install 
'smolagents[openai]'\n```\n\n그다음, [OpenRouter API 키를 얻고](https://openrouter.ai/keys) 코드에서 설정합니다:\n```python\nOPENROUTER_API_KEY = \"<YOUR-OPENROUTER-API-KEY>\"\n```\n\n이제 `OpenAIModel` 클래스를 사용하여 OpenRouter에서 사용 가능한 모든 모델을 초기화할 수 있습니다:\n```python\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    # OpenRouter에서 사용 가능한 모든 모델 ID를 사용할 수 있습니다\n    model_id=\"openai/gpt-4o\",\n    # OpenRouter API 베이스 URL\n    api_base=\"https://openrouter.ai/api/v1\",\n    api_key=OPENROUTER_API_KEY,\n)\n```\n\n## xAI의 Grok 모델 사용하기 [[using-xais-grok-models]]\n\nxAI의 Grok 모델은 [`LiteLLMModel`]을 통해 접근할 수 있습니다.\n\n일부 모델(\"grok-4\" 및 \"grok-3-mini\" 등)은 `stop` 매개변수를 지원하지 않으므로,\nAPI 호출에서 이를 제외하기 위해 `REMOVE_PARAMETER`를 사용해야 합니다.\n\n먼저, 필요한 의존성을 설치합니다:\n```bash\npip install 'smolagents[litellm]'\n```\n\n그다음, [xAI API 키를 얻고](https://console.x.ai/) 코드에서 설정합니다:\n```python\nXAI_API_KEY = \"<YOUR-XAI-API-KEY>\"\n```\n\n이제 `LiteLLMModel` 클래스를 사용하여 Grok 모델을 초기화하고 해당되는 경우 `stop` 매개변수를 제거할 수 있습니다:\n```python\nfrom smolagents import LiteLLMModel, REMOVE_PARAMETER\n\n# Grok-4 사용\nmodel = LiteLLMModel(\n    model_id=\"xai/grok-4\",\n    api_key=XAI_API_KEY,\n    stop=REMOVE_PARAMETER,  # grok-4 모델이 이를 지원하지 않으므로 stop 매개변수 제거\n    temperature=0.7\n)\n\n# 또는 Grok-3-mini 사용\nmodel_mini = LiteLLMModel(\n    model_id=\"xai/grok-3-mini\",\n    api_key=XAI_API_KEY,\n    stop=REMOVE_PARAMETER,  # grok-3-mini 모델이 이를 지원하지 않으므로 stop 매개변수 제거\n    max_tokens=1000\n)\n```\n"
  },
  {
    "path": "docs/source/ko/examples/web_browser.md",
    "content": "# 에이전트를 활용한 웹 브라우저 자동화 🤖🌐[[web-browser-automation-with-agents-🤖🌐]]\n\n[[open-in-colab]]\n\n이 노트북에서는 **에이전트 기반 웹 브라우저 자동화 시스템**을 구축해보겠습니다! 이 시스템은 웹사이트 탐색, 요소 상호작용, 정보 자동 추출이 가능합니다.\n\n에이전트는 다음과 같은 기능을 수행할 수 있습니다.\n\n- [x] 웹 페이지 탐색\n- [x] 요소 클릭\n- [x] 페이지 내 검색\n- [x] 팝업 및 모달 처리\n- [x] 정보 추출\n\n단계별로 이 시스템을 구축해보겠습니다!\n\n먼저 필요한 의존성을 설치하기 위해 다음을 실행하세요.\n\n```bash\npip install smolagents selenium helium pillow -q\n```\n\n필요한 라이브러리를 가져오고 환경 변수를 설정해보겠습니다.\n\n```python\nfrom io import BytesIO\nfrom time import sleep\n\nimport helium\nfrom dotenv import load_dotenv\nfrom PIL import Image\nfrom selenium import webdriver\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\n\nfrom smolagents import CodeAgent, tool\nfrom smolagents.agents import ActionStep\n\n# 환경 변수를 불러옵니다.\nload_dotenv()\n```\n\n이제 에이전트가 웹 페이지를 탐색하고 상호작용할 수 있도록 하는 핵심 브라우저 상호작용 도구들을 만들어보겠습니다.\n\n```python\n@tool\ndef search_item_ctrl_f(text: str, nth_result: int = 1) -> str:\n    \"\"\"\n    현재 페이지에서 Ctrl + F를 사용해 지정된 텍스트를 검색하고, n번째로 등장하는 위치로 이동합니다.\n    인자:\n        text: 검색할 텍스트\n        nth_result: 이동할 n번째 검색 결과 (기본값: 1)\n    \"\"\"\n    elements = driver.find_elements(By.XPATH, f\"//*[contains(text(), '{text}')]\")\n    if nth_result > len(elements):\n        raise Exception(f\"Match n°{nth_result} not found (only {len(elements)} matches found)\")\n    result = f\"Found {len(elements)} matches for '{text}'.\"\n    elem = elements[nth_result - 1]\n    driver.execute_script(\"arguments[0].scrollIntoView(true);\", elem)\n    result += f\"Focused on element {nth_result} of {len(elements)}\"\n    return result\n\n@tool\ndef go_back() -> None:\n    \"\"\"이전 페이지로 돌아갑니다.\"\"\"\n    driver.back()\n\n@tool\ndef close_popups() -> str:\n    \"\"\"\n    Closes any visible modal or pop-up on the page. 
Use this to dismiss pop-up windows!\n    This does not work on cookie consent banners.\n    \"\"\"\n    webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()\n```\n\nChrome으로 브라우저를 설정하고 스크린샷 기능을 구성해보겠습니다.\n\n```python\n# Configure Chrome options\nchrome_options = webdriver.ChromeOptions()\nchrome_options.add_argument(\"--force-device-scale-factor=1\")\nchrome_options.add_argument(\"--window-size=1000,1350\")\nchrome_options.add_argument(\"--disable-pdf-viewer\")\nchrome_options.add_argument(\"--window-position=0,0\")\n\n# Initialize the browser\ndriver = helium.start_chrome(headless=False, options=chrome_options)\n\n# Set up screenshot callback\ndef save_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # Let JavaScript animations happen before taking the screenshot\n    driver = helium.get_driver()\n    current_step = memory_step.step_number\n    if driver is not None:\n        for previous_memory_step in agent.memory.steps:  # Remove previous screenshots for lean processing\n            if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= current_step - 2:\n                previous_memory_step.observations_images = None\n        png_bytes = driver.get_screenshot_as_png()\n        image = Image.open(BytesIO(png_bytes))\n        print(f\"Captured a browser screenshot: {image.size} pixels\")\n        memory_step.observations_images = [image.copy()]  # Create a copy to ensure it persists\n\n    # Update observations with current URL\n    url_info = f\"Current url: {driver.current_url}\"\n    memory_step.observations = (\n        url_info if memory_step.observations is None else memory_step.observations + \"\\n\" + url_info\n    )\n```\n\n이제 웹 자동화 에이전트를 만들어보겠습니다.\n\n```python\nfrom smolagents import InferenceClientModel\n\n# Initialize the model\nmodel_id = \"Qwen/Qwen2-VL-72B-Instruct\"  # You can change this to your preferred VLM model\nmodel = InferenceClientModel(model_id=model_id)\n\n# 
Create the agent\nagent = CodeAgent(\n    tools=[go_back, close_popups, search_item_ctrl_f],\n    model=model,\n    additional_authorized_imports=[\"helium\"],\n    step_callbacks=[save_screenshot],\n    max_steps=20,\n    verbosity_level=2,\n)\n\n# Import helium for the agent\nagent.python_executor(\"from helium import *\", agent.state)\n```\n\n에이전트가 웹 자동화를 위해 Helium을 사용하려면 지침이 필요합니다. 다음은 제공할 지침입니다.\n\n```python\nhelium_instructions = \"\"\"\nYou can use helium to access websites. Don't bother about the helium driver, it's already managed.\nWe've already ran \"from helium import *\"\nThen you can go to pages!\nCode:\n```py\ngo_to('github.com/trending')\n```<end_code>\n\nYou can directly click clickable elements by inputting the text that appears on them.\nCode:\n```py\nclick(\"Top products\")\n```<end_code>\n\nIf it's a link:\nCode:\n```py\nclick(Link(\"Top products\"))\n```<end_code>\n\nIf you try to interact with an element and it's not found, you'll get a LookupError.\nIn general stop your action after each button click to see what happens on your screenshot.\nNever try to login in a page.\n\nTo scroll up or down, use scroll_down or scroll_up with as an argument the number of pixels to scroll from.\nCode:\n```py\nscroll_down(num_pixels=1200) # This will scroll one viewport down\n```<end_code>\n\nWhen you have pop-ups with a cross icon to close, don't try to click the close icon by finding its element or targeting an 'X' element (this most often fails).\nJust use your built-in tool `close_popups` to close them:\nCode:\n```py\nclose_popups()\n```<end_code>\n\nYou can use .exists() to check for the existence of an element. For example:\nCode:\n```py\nif Text('Accept cookies?').exists():\n    click('I accept')\n```<end_code>\n\"\"\"\n```\n\n이제 작업과 함께 에이전트를 실행할 수 있습니다! 
Wikipedia에서 정보를 찾는 것을 시도해보겠습니다.\n\n```python\nsearch_request = \"\"\"\nPlease navigate to https://en.wikipedia.org/wiki/Chicago and give me a sentence containing the word \"1992\" that mentions a construction accident.\n\"\"\"\n\nagent_output = agent.run(search_request + helium_instructions)\nprint(\"Final output:\")\nprint(agent_output)\n```\n\n요청을 수정하여 다른 작업을 실행할 수 있습니다. 예를 들어, 제가 얼마나 열심히 일해야 하는지 알아보는 작업입니다.\n\n```python\ngithub_request = \"\"\"\nI'm trying to find how hard I have to work to get a repo in github.com/trending.\nCan you navigate to the profile for the top author of the top trending repo, and give me their total number of commits over the last year?\n\"\"\"\n\nagent_output = agent.run(github_request + helium_instructions)\nprint(\"Final output:\")\nprint(agent_output)\n```\n\n이 시스템은 특히 다음과 같은 작업에 효과적입니다.\n- 웹사이트에서 데이터 추출\n- 웹 리서치 자동화\n- UI 테스트 및 검증\n- 콘텐츠 모니터링\n"
  },
  {
    "path": "docs/source/ko/guided_tour.md",
    "content": "# 에이전트 안내서[[agents---guided-tour]]\n\n[[open-in-colab]]\n\n이 안내서에서는 에이전트를 구축하는 방법, 실행하는 방법, 그리고 사용 사례에 맞게 더 잘 작동하도록 맞춤 설정하는 방법을 학습합니다.\n\n## 에이전트 유형 선택: CodeAgent 또는 ToolCallingAgent[[choosing-an-agent-type:-codeagent-or-toolcallingagent]]\n\n`smolagents`는 [`CodeAgent`]와 [`ToolCallingAgent`] 두 가지 에이전트 클래스를 제공하는데, 이 두 클래스는 각각 에이전트가 도구와 상호작용하는 방법이 다릅니다.\n두 방식의 핵심 차이점은 '액션을 지정하고 실행'하는 방식에 있습니다: 코드 생성 vs 구조화된 도구 호출.\n\n- [`CodeAgent`]는 도구 호출을 Python 코드 스니펫으로 생성합니다.\n  - 코드는 로컬에서 실행되거나(잠재적으로 불안전) 보안 샌드박스에서 실행됩니다.\n  - 도구는 Python 함수로 노출됩니다(바인딩을 통해).\n  - 도구 호출 예시:\n    ```py\n    result = search_docs(\"What is the capital of France?\")\n    print(result)\n    ```\n  - 장점:\n    - 높은 표현력: 복잡한 로직과 제어 흐름을 허용하고 도구를 결합하고, 반복하고, 변환하고, 추론할 수 있습니다.\n    - 유연성: 모든 가능한 액션을 미리 정의할 필요가 없으며, 동적으로 새로운 액션/도구를 생성할 수 있습니다.\n    - 창발적 추론: 다단계 문제나 동적 로직에 이상적입니다.\n  - 제한사항\n    - 오류 위험: 구문 오류, 예외를 처리해야 합니다.\n    - 예측성 부족: 예상치 못한 또는 안전하지 않은 출력에 더 취약합니다.\n    - 보안 실행 환경이 필요합니다.\n\n- [`ToolCallingAgent`]는 도구 호출을 구조화된 JSON으로 작성합니다.\n  - 이는 많은 프레임워크(OpenAI API)에서 사용되는 일반적인 형식으로, 코드 실행 없이 구조화된 도구 상호작용을 가능하게 합니다.\n  - 도구는 JSON 스키마로 정의됩니다: 이름, 설명, 매개변수 타입 등.\n  - 도구 호출 예시:\n    ```json\n    {\n      \"tool_call\": {\n        \"name\": \"search_docs\",\n        \"arguments\": {\n          \"query\": \"What is the capital of France?\"\n        }\n      }\n    }\n    ```\n  - 장점:\n    - 안정성: 환각이 적고, 출력이 구조화되고 검증됩니다.\n    - 안전성: 인수가 엄격하게 검증되고, 임의의 코드가 실행될 위험이 없습니다.\n    - 상호 운용성: 외부 API나 서비스에 쉽게 매핑됩니다.\n  - 제한사항:\n    - 낮은 표현력: 결과를 동적으로 쉽게 결합하거나 변환할 수 없고, 복잡한 로직이나 제어 흐름을 수행할 수 없습니다.\n    - 유연성 부족: 모든 가능한 액션을 미리 정의해야 하고, 사전 정의된 도구로 제한됩니다.\n    - 코드 합성 없음: 도구 기능으로 제한됩니다.\n\n어떤 에이전트 유형을 사용할지:\n- [`CodeAgent`]를 사용하는 경우:\n  - 추론, 연결 또는 동적 구성이 필요한 경우.\n  - 도구가 결합할 수 있는 함수인 경우(예: 구문 분석 + 수학 + 쿼리).\n  - 에이전트가 문제 해결자 또는 프로그래머인 경우.\n\n- [`ToolCallingAgent`]를 사용하는 경우:\n  - 단순하고 독립적인 도구가 있는 경우(예: API 호출, 문서 가져오기).\n  - 높은 안정성과 명확한 검증을 원하는 경우.\n  - 에이전트가 디스패처나 컨트롤러 같은 역할인 경우.\n\n## 
CodeAgent[[codeagent]]\n\n[`CodeAgent`]는 액션을 수행하고 작업을 해결하기 위해 Python 코드 스니펫을 생성합니다.\n\n기본적으로 Python 코드 실행은 로컬 환경에서 수행됩니다.\n사용자가 제공한 도구들(특히 Hugging Face 도구만 있는 경우)과 `print`나 `math` 모듈 함수 같은 사전 정의된 안전한 함수들만 호출할 수 있도록 제한되어 있어 안전합니다.\n\nPython 인터프리터는 기본적으로 안전 목록에 포함된 모듈만 import를 허용하므로, 대부분의 명백한 보안 공격을 방지할 수 있습니다.\n[`CodeAgent`]를 초기화할 때 `additional_authorized_imports` 인수에 문자열 목록으로 승인된 모듈을 전달하여 추가 import를 승인할 수 있습니다:\n\n```py\nmodel = InferenceClientModel()\nagent = CodeAgent(tools=[], model=model, additional_authorized_imports=['requests', 'bs4'])\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n또한 추가 보안 계층으로, import 목록에서 명시적으로 승인되지 않는 한 서브모듈에 대한 접근은 기본적으로 금지됩니다.\n예를 들어, `numpy.random` 서브모듈에 접근하려면 `additional_authorized_imports` 목록에 `'numpy.random'`을 추가해야 합니다.\n이는 `numpy`와 `numpy.random` 같은 모든 서브패키지 및 자체 서브패키지를 허용하는 `numpy.*`를 사용하여 승인할 수도 있습니다.\n\n> [!WARNING]\n> LLM은 실행될 임의의 코드를 생성할 수 있습니다: 안전하지 않은 import는 추가하지 마세요!\n\n불법적인 작업을 수행하려고 시도하는 코드나 에이전트가 생성한 코드에 일반적인 Python 오류가 있는 경우 실행이 중단됩니다.\n\n로컬 Python 인터프리터 대신 [E2B code executor](https://e2b.dev/docs#what-is-e2-b)나 Docker를 사용할 수도 있습니다. E2B의 경우, 먼저 [`E2B_API_KEY` 환경 변수를 설정](https://e2b.dev/dashboard?tab=keys)한 다음 에이전트 초기화 시 `executor_type=\"e2b\"`를 전달하세요. 
Docker의 경우, 초기화 중에 `executor_type=\"docker\"`를 전달하세요.\n\n> [!TIP]\n> 코드 실행에 대해 더 자세히 알아보려면 [이 튜토리얼](tutorials/secure_code_execution)을 확인하세요.\n\n### ToolCallingAgent[[toolcallingagent]]\n\n[`ToolCallingAgent`]는 많은 프레임워크(OpenAI API)에서 사용되는 일반적인 형식인 JSON 도구 호출을 출력하여, 코드 실행 없이 구조화된 도구 상호작용을 가능하게 합니다.\n\n코드를 실행하지 않으므로 `additional_authorized_imports` 없이도 [`CodeAgent`]와 거의 동일한 방식으로 작동합니다:\n\n```py\nfrom smolagents import ToolCallingAgent, WebSearchTool\n\nagent = ToolCallingAgent(tools=[WebSearchTool()], model=model)\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n## 에이전트 구축[[building-your-agent]]\n\n최소한의 에이전트를 초기화하려면 최소한 다음 두 인수가 필요합니다:\n\n- `model`, 에이전트를 구동하는 텍스트 생성 모델 - 에이전트는 단순한 LLM과 다르며, LLM을 엔진으로 사용하는 시스템입니다. 다음 옵션 중 하나를 사용할 수 있습니다:\n    - [`TransformersModel`]은 사전 초기화된 `transformers` 파이프라인을 가져와 `transformers`를 사용하여 로컬 머신에서 추론을 실행합니다.\n    - [`InferenceClientModel`]은 내부적으로 `huggingface_hub.InferenceClient`를 활용하며 Hub의 모든 추론 제공자를 지원합니다: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together 등.\n    - [`LiteLLMModel`]은 마찬가지로 [LiteLLM](https://docs.litellm.ai/)을 통해 100개 이상의 다양한 모델과 제공자를 호출할 수 있습니다!\n    - [`AzureOpenAIModel`]은 [Azure](https://azure.microsoft.com/en-us/products/ai-services/openai-service)에 배포된 OpenAI 모델을 사용할 수 있게 해줍니다.\n    - [`AmazonBedrockModel`]은 [AWS](https://aws.amazon.com/bedrock/?nc1=h_ls)의 Amazon Bedrock을 사용할 수 있게 해줍니다.\n    - [`MLXModel`]은 로컬 머신에서 추론을 실행하기 위한 [mlx-lm](https://pypi.org/project/mlx-lm/) 파이프라인을 생성합니다.\n\n- `tools`, 에이전트가 작업 해결에 사용할 수 있는 도구 목록입니다. 빈 목록으로 설정할 수도 있습니다. add_base_tools=True 옵션을 사용하면 기본 제공되는 도구들(웹 검색, 코드 실행, 음성 인식 등)을 `tools` 목록에 추가할 수 있습니다.\n\n`tools`와 `model` 두 인수를 설정하면 에이전트를 생성하고 실행할 수 있습니다. 
[추론 제공자](https://huggingface.co/blog/inference-providers), [transformers](https://github.com/huggingface/transformers/), [ollama](https://ollama.com/), [LiteLLM](https://www.litellm.ai/), [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service), [Amazon Bedrock](https://aws.amazon.com/bedrock/?nc1=h_ls), 또는 [mlx-lm](https://pypi.org/project/mlx-lm/)을 통해 원하는 LLM을 사용할 수 있습니다.\n\n모든 모델 클래스는 인스턴스화 시점에 추가 키워드 인수(예: `temperature`, `max_tokens`, `top_p` 등)를 직접 전달하는 것을 지원합니다.\n이러한 매개변수는 기본 모델의 완성 호출에 자동으로 전달되어 창의성, 응답 길이, 샘플링 전략 등의 모델 동작을 구성할 수 있습니다.\n\n<hfoptions id=\"Pick a LLM\">\n<hfoption id=\"Inference Providers\">\n\n추론 제공자는 인증을 위해 `HF_TOKEN`이 필요하지만, 무료 HF 계정에는 이미 포함된 크레딧이 제공됩니다. PRO로 업그레이드하여 포함된 크레딧을 늘리세요.\n\n제한된 모델에 접근하거나 PRO 계정으로 속도 제한을 높이려면 환경 변수 `HF_TOKEN`을 설정하거나 `InferenceClientModel` 초기화 시 `token` 변수를 전달해야 합니다. [설정 페이지](https://huggingface.co/settings/tokens)에서 토큰을 얻을 수 있습니다.\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nmodel = InferenceClientModel(model_id=model_id, token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\") # You can choose to not pass any model_id to InferenceClientModel to use a default model\n# you can also specify a particular provider e.g. 
provider=\"together\" or provider=\"sambanova\"\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Local Transformers Model\">\n\n```python\n# !pip install smolagents[transformers]\nfrom smolagents import CodeAgent, TransformersModel\n\nmodel_id = \"meta-llama/Llama-3.2-3B-Instruct\"\n\nmodel = TransformersModel(model_id=model_id)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"OpenAI or Anthropic API\">\n\n`LiteLLMModel`을 사용하려면 환경 변수 `ANTHROPIC_API_KEY` 또는 `OPENAI_API_KEY`를 설정하거나 초기화 시 `api_key` 변수를 전달해야 합니다.\n\n```python\n# !pip install smolagents[litellm]\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", api_key=\"YOUR_ANTHROPIC_API_KEY\") # Could use 'gpt-4o'\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Ollama\">\n\n```python\n# !pip install smolagents[litellm]\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(\n    model_id=\"ollama_chat/llama3.2\", # This model is a bit weak for agentic behaviours though\n    api_base=\"http://localhost:11434\", # replace with 127.0.0.1:11434 or remote open-ai compatible server if necessary\n    api_key=\"YOUR_API_KEY\", # replace with API key if necessary\n    num_ctx=8192, # ollama default is 2048 which will fail horribly. 8192 works for easy tasks, more is better. 
Check https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator to calculate how much VRAM this will need for the selected model.\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Azure OpenAI\">\n\nAzure OpenAI에 연결하려면 `AzureOpenAIModel`을 직접 사용하거나 `LiteLLMModel`을 사용하여 적절히 구성할 수 있습니다.\n\n`AzureOpenAIModel`의 인스턴스를 초기화하려면 모델 배포 이름을 전달한 다음 `azure_endpoint`, `api_key`, `api_version` 인수를 전달하거나 환경 변수 `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, `OPENAI_API_VERSION`을 설정해야 합니다.\n\n```python\n# !pip install smolagents[openai]\nfrom smolagents import CodeAgent, AzureOpenAIModel\n\nmodel = AzureOpenAIModel(model_id=\"gpt-4o-mini\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\n마찬가지로 다음과 같이 `LiteLLMModel`을 구성하여 Azure OpenAI에 연결할 수 있습니다:\n\n- 모델 배포 이름을 `model_id`로 전달하고, 앞에 `azure/`를 붙여야 합니다.\n- 환경 변수 `AZURE_API_VERSION`을 설정해야 합니다.\n- `api_base`와 `api_key` 인수를 전달하거나 환경 변수 `AZURE_API_KEY`, `AZURE_API_BASE`를 설정합니다.\n\n```python\nimport os\nfrom smolagents import CodeAgent, LiteLLMModel\n\nAZURE_OPENAI_CHAT_DEPLOYMENT_NAME=\"gpt-35-turbo-16k-deployment\" # example of deployment name\n\nos.environ[\"AZURE_API_KEY\"] = \"\" # api_key\nos.environ[\"AZURE_API_BASE\"] = \"\" # \"https://example-endpoint.openai.azure.com\"\nos.environ[\"AZURE_API_VERSION\"] = \"\" # \"2024-10-01-preview\"\n\nmodel = LiteLLMModel(model_id=\"azure/\" + AZURE_OPENAI_CHAT_DEPLOYMENT_NAME)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n   \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\n</hfoption>\n<hfoption id=\"Amazon Bedrock\">\n\n`AmazonBedrockModel` 클래스는 Amazon Bedrock과 직접 연동되어 API 호출과 세부 구성을 지원합니다.\n\n기본 사용법:\n\n```python\n# !pip install smolagents[aws_sdk]\nfrom smolagents import 
CodeAgent, AmazonBedrockModel\n\nmodel = AmazonBedrockModel(model_id=\"anthropic.claude-3-sonnet-20240229-v1:0\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\n고급 구성:\n\n```python\nimport boto3\nfrom smolagents import AmazonBedrockModel, CodeAgent\n\n# Create a custom Bedrock client\nbedrock_client = boto3.client(\n    'bedrock-runtime',\n    region_name='us-east-1',\n    aws_access_key_id='YOUR_ACCESS_KEY',\n    aws_secret_access_key='YOUR_SECRET_KEY'\n)\n\nadditional_api_config = {\n    \"inferenceConfig\": {\n        \"maxTokens\": 3000\n    },\n    \"guardrailConfig\": {\n        \"guardrailIdentifier\": \"identify1\",\n        \"guardrailVersion\": 'v1'\n    },\n}\n\n# Initialize with comprehensive configuration\nmodel = AmazonBedrockModel(\n    model_id=\"us.amazon.nova-pro-v1:0\",\n    client=bedrock_client,  # Use custom client\n    **additional_api_config\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\nLiteLLMModel 사용:\n\n또는 Bedrock 모델과 함께 `LiteLLMModel`을 사용할 수 있습니다:\n\n```python\nfrom smolagents import LiteLLMModel, CodeAgent\n\nmodel = LiteLLMModel(model_id=\"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\")\nagent = CodeAgent(tools=[], model=model)\n\nagent.run(\"Explain the concept of quantum computing\")\n```\n\n</hfoption>\n<hfoption id=\"mlx-lm\">\n\n```python\n# !pip install smolagents[mlx-lm]\nfrom smolagents import CodeAgent, MLXModel\n\nmlx_model = MLXModel(\"mlx-community/Qwen2.5-Coder-32B-Instruct-4bit\")\nagent = CodeAgent(model=mlx_model, tools=[], add_base_tools=True)\n\nagent.run(\"Could you give me the 118th number in the Fibonacci sequence?\")\n```\n\n</hfoption>\n</hfoptions>\n\n## 고급 에이전트 구성[[advanced-agent-configuration]]\n\n### 에이전트 종료 조건 맞춤 설정[[customizing-agent-termination-conditions]]\n\n기본적으로 에이전트는 
`final_answer` 함수를 호출하거나 최대 단계 수에 도달할 때까지 계속 실행됩니다.\n`final_answer_checks` 매개변수는 에이전트가 실행을 종료하는 시점과 방법을 더 세밀하게 제어할 수 있게 해줍니다:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# Define a custom final answer check function\ndef is_integer(final_answer: str, agent_memory=None) -> bool:\n    \"\"\"Return True if final_answer is an integer.\"\"\"\n    try:\n        int(final_answer)\n        return True\n    except ValueError:\n        return False\n\n# Initialize agent with custom final answer check\nagent = CodeAgent(\n    tools=[],\n    model=InferenceClientModel(),\n    final_answer_checks=[is_integer]\n)\n\nagent.run(\"Calculate the least common multiple of 3 and 7\")\n```\n\n`final_answer_checks` 매개변수는 각각 다음과 같은 함수들의 목록을 받습니다:\n- 에이전트의 final_answer 문자열과 에이전트의 메모리를 매개변수로 받습니다\n- final_answer가 유효한지(True) 아닌지(False)를 나타내는 불리언을 반환합니다\n\n함수 중 하나라도 `False`를 반환하면 에이전트는 오류 메시지를 로그에 기록하고 실행을 계속합니다.\n이 검증 메커니즘은 다음을 가능하게 합니다:\n- 출력 형식 요구사항 강제(예: 수학 문제에 대한 숫자 답변 보장)\n- 도메인별 검증 규칙 구현\n- 자체 출력을 검증하는 더 견고한 에이전트 생성\n\n## 에이전트 실행 검사[[inspecting-an-agent-run]]\n\n실행 후 무슨 일이 일어났는지 확인하는 데 유용한 몇 가지 속성이 있습니다:\n\n- `agent.logs`는 에이전트의 상세한 실행 로그를 저장합니다. 에이전트 실행의 각 단계마다 모든 정보가 딕셔너리 형태로 저장되어 `agent.logs`에 추가됩니다.\n- `agent.write_memory_to_messages()`는 에이전트의 메모리를 모델이 볼 수 있는 채팅 메시지 목록으로 변환합니다. 이 메소드는 로그의 각 단계를 살펴보고 중요한 내용만 메시지로 저장합니다. 예를 들어, 시스템 프롬프트와 작업을 각각 별도 메시지로 저장하고, 각 단계의 LLM 출력과 도구 호출 결과를 개별 메시지로 저장합니다. 전체적인 흐름 파악이 필요할 때 권장드립니다. 단, 모든 로그가 이 메소드를 통해 기록되는 것은 아닙니다.\n\n## 도구[[tools]]\n\n도구는 에이전트가 사용할 수 있는 독립적인 함수입니다. LLM이 도구를 사용하기 위해서는 먼저 API를 구성해야하며, 또한 LLM에게 해당 도구 호출하는 방법을 설명해주어야합니다 :\n- 이름\n- 설명\n- 입력 타입과 설명\n- 출력 타입\n\n예를 들어 [`PythonInterpreterTool`]을 확인할 수 있습니다: 이름, 설명, 입력 설명, 출력 타입, 그리고 액션을 수행하는 `forward` 메소드가 있습니다.\n\n에이전트가 초기화될 때 도구 속성이 에이전트의 시스템 프롬프트에 포함되는 도구 설명을 생성하는 데 사용됩니다. 이를 통해 에이전트는 사용할 수 있는 도구와 그 이유를 알 수 있습니다.\n\n**스키마 정보**: `output_schema`가 정의된 도구(구조화된 출력을 가진 MCP 도구 등)의 경우, `CodeAgent` 시스템 프롬프트에 자동으로 JSON 스키마 정보가 포함됩니다. 
이는 에이전트가 도구 출력의 예상 구조를 이해하고 데이터에 적절히 접근할 수 있도록 도와줍니다.\n\n### 기본 툴박스[[default-toolbox]]\n\n\"toolkit\" extra와 함께 `smolagents`를 설치하면 에이전트를 강화하는 기본 툴박스가 함께 제공되며, `add_base_tools=True` 인수로 초기화 시 에이전트에 추가할 수 있습니다:\n\n- **DuckDuckGo 웹 검색**: DuckDuckGo 브라우저를 사용하여 웹 검색을 수행합니다.\n- **Python 코드 인터프리터**: 보안 환경에서 LLM이 생성한 Python 코드를 실행합니다. 이 도구는 코드 기반 에이전트가 이미 기본적으로 Python 코드를 실행할 수 있으므로 `add_base_tools=True`로 초기화할 때만 [`ToolCallingAgent`]에 추가됩니다.\n- **Transcriber**: 오디오를 텍스트로 변환하는 Whisper-Turbo 기반의 음성-텍스트 파이프라인입니다.\n\n인수와 함께 호출하여 도구를 수동으로 사용할 수 있습니다.\n\n```python\n# !pip install smolagents[toolkit]\nfrom smolagents import WebSearchTool\n\nsearch_tool = WebSearchTool()\nprint(search_tool(\"Who's the current president of Russia?\"))\n```\n\n### 새로운 도구 생성[[create-a-new-tool]]\n\nHugging Face의 기본 도구가 다루지 않는 사용 사례를 위해 자신만의 도구를 만들 수 있습니다.\n예를 들어, Hub에서 주어진 작업에 대해 가장 많이 다운로드된 모델을 반환하는 도구를 만들어보겠습니다.\n\n아래 코드부터 시작하겠습니다.\n\n```python\nfrom huggingface_hub import list_models\n\ntask = \"text-classification\"\n\nmost_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\nprint(most_downloaded_model.id)\n```\n\n이 코드는 함수로 만들고 `tool` 데코레이터를 추가하여 간단히 도구로 변환할 수 있습니다.\n하지만 이것이 도구를 만드는 유일한 방법은 아닙니다. [Tool]의 하위 클래스로 직접 정의하는 방법도 있으며, 이 방식은 더 많은 유연성을 제공합니다. 예를 들어 리소스 집약적인 클래스 속성을 초기화할 때 유용합니다.\n\n두 옵션 모두에서 어떻게 작동하는지 살펴보겠습니다:\n\n<hfoptions id=\"build-a-tool\">\n<hfoption id=\"Decorate a function with @tool\">\n\n```py\nfrom huggingface_hub import list_models\nfrom smolagents import tool\n\n@tool\ndef model_download_tool(task: str) -> str:\n    \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\n\n    Args:\n        task: The task for which to get the download count.\n    \"\"\"\n    most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n    return most_downloaded_model.id\n```\n\n함수에는 다음이 필요합니다:\n- 명확한 이름. 이름은 에이전트를 구동하는 LLM이 이해할 수 있도록 이 도구가 무엇을 하는지 충분히 설명적이어야 합니다. 
이 도구는 작업에 대해 가장 많이 다운로드된 모델을 반환하므로 `model_download_tool`이라고 명명하겠습니다.\n- 입력과 출력 모두에 대한 타입 힌트\n- 각 인수가 설명되는 'Args:' 부분을 포함하는 설명(이번에는 타입 표시 없이, 타입 힌트에서 가져옵니다). 도구 이름과 마찬가지로, 이 설명은 에이전트를 구동하는 LLM을 위한 설명서이므로 소홀히 하지 마세요.\n\n이 모든 요소는 초기화 시 에이전트의 시스템 프롬프트에 자동으로 포함됩니다: 따라서 최대한 명확하게 만들도록 노력하세요!\n\n> [!TIP]\n> 이 정의 형식은 `apply_chat_template`에서 사용되는 도구 스키마와 동일하며, 유일한 차이점은 추가된 `tool` 데코레이터입니다: 도구 사용 API에 대해 더 자세히 알아보려면 [여기](https://huggingface.co/blog/unified-tool-use#passing-tools-to-a-chat-template)를 읽어보세요.\n\n\n그런 다음 에이전트를 직접 초기화할 수 있습니다:\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[model_download_tool], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n<hfoption id=\"Subclass Tool\">\n\n```py\nfrom smolagents import Tool\n\nclass ModelDownloadTool(Tool):\n    name = \"model_download_tool\"\n    description = \"This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. It returns the name of the checkpoint.\"\n    inputs = {\"task\": {\"type\": \"string\", \"description\": \"The task for which to get the download count.\"}}\n    output_type = \"string\"\n\n    def forward(self, task: str) -> str:\n        most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return most_downloaded_model.id\n```\n\n하위 클래스에는 다음 속성이 필요합니다:\n- 명확한 `name` (이름). 에이전트를 구동하는 LLM이 도구의 기능을 이해할 수 있도록 이름에 대해 충분히 설명해야 합니다. 이 도구는 작업에 대해 가장 많이 다운로드된 모델을 반환하므로 `model_download_tool`이라고 명명하겠습니다.\n- `description`. 
`name`과 마찬가지로, 이 설명은 에이전트를 구동하는 LLM을 위한 설명서이므로 소홀히 하지 마세요.\n- 입력 타입과 설명\n- 출력 타입\n이 모든 속성은 초기화 시 에이전트의 시스템 프롬프트에 자동으로 포함됩니다: 따라서 최대한 명확하게 만들도록 노력하세요!\n\n\n그런 다음 에이전트를 직접 초기화할 수 있습니다:\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[ModelDownloadTool()], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n</hfoptions>\n\n다음 로그를 얻습니다:\n```text\n╭──────────────────────────────────────── New run ─────────────────────────────────────────╮\n│                                                                                          │\n│ Can you give me the name of the model that has the most downloads in the 'text-to-video' │\n│ task on the Hugging Face Hub?                                                            │\n│                                                                                          │\n╰─ InferenceClientModel - Qwen/Qwen2.5-Coder-32B-Instruct ───────────────────────────────────────────╯\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 model_name = model_download_tool(task=\"text-to-video\")                               │\n│   2 print(model_name)                                                                    │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nExecution logs:\nByteDance/AnimateDiff-Lightning\n\nOut: None\n[Step 0: Duration 0.27 seconds| Input tokens: 2,069 | Output tokens: 60]\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 final_answer(\"ByteDance/AnimateDiff-Lightning\")                                      
│\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nOut - Final answer: ByteDance/AnimateDiff-Lightning\n[Step 1: Duration 0.10 seconds| Input tokens: 4,288 | Output tokens: 148]\nOut[20]: 'ByteDance/AnimateDiff-Lightning'\n```\n\n> [!TIP]\n> 도구에 대해 더 자세히 알아보려면 [전용 튜토리얼](./tutorials/tools#what-is-a-tool-and-how-to-build-one)을 읽어보세요.\n\n## 멀티 에이전트[[multi-agents]]\n\n멀티 에이전트 시스템은 Microsoft의 프레임워크 [Autogen](https://huggingface.co/papers/2308.08155)과 함께 도입되었습니다.\n\n이러한 프레임워크에서는 단일 에이전트 대신 여러 에이전트가 협력하여 작업을 해결합니다.\n실제로 대부분의 벤치마크에서 더 우수한 성능을 보여줍니다. 성능이 향상되는 이유는 개념적으로 단순합니다. 많은 작업에서 모든 기능을 담당하는 범용 시스템보다는 특정 하위 작업에 특화된 전문 단위를 사용하는 것이 더 효과적이기 때문입니다. 서로 다른 도구와 메모리를 가진 에이전트들을 활용하면 효율적인 역할 분담이 가능합니다. 예를 들어, 웹 검색 에이전트가 수집한 모든 웹페이지 내용을 코드 생성 에이전트의 메모리에까지 저장할 필요가 있을까요? 각자의 역할에 맞게 분리해서 운영하는 것이 훨씬 효율적입니다.\n\n`smolagents`로 계층적 멀티 에이전트 시스템을 쉽게 구축할 수 있습니다.\n\n이를 위해서는 에이전트에 `name`과 `description` 속성만 있으면 되며, 이는 도구와 마찬가지로 관리자 에이전트의 시스템 프롬프트에 포함되어 관리되는 에이전트를 호출하는 방법을 알려줍니다.\n그런 다음 관리자 에이전트를 초기화할 때 `managed_agents` 매개변수에 이 관리되는 에이전트를 전달할 수 있습니다.\n\n다음은 네이티브 [`WebSearchTool`]을 사용하여 특정 웹 검색 에이전트를 관리하는 에이전트를 만드는 예시입니다:\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel, WebSearchTool\n\nmodel = InferenceClientModel()\n\nweb_agent = CodeAgent(\n    tools=[WebSearchTool()],\n    model=model,\n    name=\"web_search_agent\",\n    description=\"Runs web searches for you. Give it your query as an argument.\"\n)\n\nmanager_agent = CodeAgent(\n    tools=[], model=model, managed_agents=[web_agent]\n)\n\nmanager_agent.run(\"Who is the CEO of Hugging Face?\")\n```\n\n> [!TIP]\n> 효율적인 멀티 에이전트 구현의 심화 예제를 보려면 [멀티 에이전트 시스템을 GAIA 리더보드 상위권으로 끌어올린 방법](https://huggingface.co/blog/beating-gaia)을 확인하세요.\n\n## 에이전트와 대화하고 멋진 Gradio 인터페이스에서 그 사고 과정을 시각화하기[[talk-with-your-agent-and-visualize-its-thoughts-in-a-cool-gradio-interface]]\n\n`GradioUI`를 사용하여 에이전트에 대화형으로 작업을 제출하고 그 사고와 실행 과정을 관찰할 수 있습니다. 
다음은 예시입니다:\n\n```py\nfrom smolagents import (\n    load_tool,\n    CodeAgent,\n    InferenceClientModel,\n    GradioUI\n)\n\n# Import tool from Hub\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nmodel = InferenceClientModel()  # No model_id passed: a default model is used\n\n# Initialize the agent with the image generation tool\nagent = CodeAgent(tools=[image_generation_tool], model=model)\n\nGradioUI(agent).launch()\n```\n\n내부적으로 사용자가 새로운 요청을 입력하면 에이전트는 `agent.run(user_request, reset=False)`로 실행됩니다.\n`reset=False` 플래그는 이 새로운 작업을 실행하기 전에 에이전트의 메모리가 플러시되지 않음을 의미하며, 이를 통해 대화가 계속될 수 있습니다.\n\n다른 에이전트 애플리케이션에서도 이 `reset=False` 인수를 사용하여 대화를 계속할 수 있습니다.\n\nGradio UI에서 사용자가 실행 중인 에이전트를 중단할 수 있도록 하려면 `agent.interrupt()` 메소드를 호출하는 버튼을 추가하면 됩니다.\n이렇게 하면 현재 단계가 끝날 때 에이전트가 중지되고 오류가 발생합니다.\n\n## 다음 단계[[next-steps]]\n\n마지막으로 에이전트를 필요에 맞게 구성했다면 Hub에 공유할 수 있습니다!\n\n```py\nagent.push_to_hub(\"m-ric/my_agent\")\n```\n\n마찬가지로, Hub에 업로드된 에이전트를 불러오려면 (해당 도구의 코드를 신뢰하는 경우에 한해) 다음을 사용하세요:\n```py\nagent.from_hub(\"m-ric/my_agent\", trust_remote_code=True)\n```\n\n더 자세한 활용법을 원한다면 다음 튜토리얼들을 참고하세요:\n- [코드 에이전트가 작동하는 방법에 대한 설명](./tutorials/secure_code_execution)\n- [좋은 에이전트를 구축하는 방법에 대한 가이드](./tutorials/building_good_agents)\n- [도구 사용에 대한 상세 가이드](./tutorials/tools)\n"
  },
  {
    "path": "docs/source/ko/index.md",
    "content": "# `smolagents`[[smolagents]]\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/license_to_call.png\" style=\"max-width:700px\"/>\n</div>\n\n## smolagents란 무엇인가요?[[what-is-smolagents]]\n\n`smolagents`는 단 몇 줄의 코드만으로 에이전트를 구축하고 실행할 수 있도록 설계된 오픈소스 Python 라이브러리입니다.\n\n`smolagents`의 주요 특징:\n\n✨ **단순함**: 에이전트 로직이 약 천 줄의 코드로 구현되어 있습니다. 코드 위에 불필요한 복잡한 구조를 추가하지 않고 단순하게 만들었습니다!\n\n🧑‍💻 **코드 에이전트의 완전한 지원**: [`CodeAgent`](reference/agents#smolagents.CodeAgent)는 도구 호출이나 계산 수행을 위해 직접 코드를 작성합니다 (\"코드 작성용 에이전트\"와는 반대 개념). 이를 통해 함수 중첩, 루프, 조건문 등을 자연스럽게 조합할 수 있습니다. 보안을 위해 [E2B](https://e2b.dev/)나 Docker를 통한 [샌드박스 환경 실행](tutorials/secure_code_execution)을 지원합니다.\n\n📡 **기본 도구 호출 에이전트 지원**: CodeAgent 외에도 [`ToolCallingAgent`](reference/agents#smolagents.ToolCallingAgent)는 일반적인 JSON/텍스트 기반 도구 호출 방식이 필요한 경우를 위해 지원됩니다.\n\n🤗 **Hub 통합**: Gradio Spaces로 에이전트와 도구를 Hub에서 원활하게 공유하고 로드할 수 있습니다.\n\n🌐 **모델 독립적**: Hub의 [Inference providers](https://huggingface.co/docs/inference-providers/index)나 OpenAI, Anthropic 등의 API를 통해 접근하거나, LiteLLM 통합으로 다양한 LLM을 쉽게 연결할 수 있습니다. Transformers나 Ollama를 사용한 로컬 실행도 가능합니다. 원하는 LLM으로 에이전트를 구동하는 것이 간단하고 유연합니다.\n\n👁️ **모달리티 독립적**: 텍스트뿐만 아니라 비전, 비디오, 오디오 입력도 처리할 수 있어 활용 가능한 애플리케이션 범위가 확장됩니다. 비전 관련 [튜토리얼](examples/web_browser)을 확인해보세요.\n\n🛠️ **도구 독립적**: [MCP 서버](reference/tools#smolagents.ToolCollection.from_mcp)의 도구나 [LangChain](reference/tools#smolagents.Tool.from_langchain)의 도구를 사용할 수 있고, [Hub Space](reference/tools#smolagents.Tool.from_space)도 도구로 활용할 수 있습니다.\n\n💻 **CLI 도구**: 보일러플레이트 코드 작성 없이 에이전트를 빠르게 실행할 수 있는 명령줄 유틸리티(smolagent, webagent)가 포함되어 있습니다.\n\n## 빠른 시작[[quickstart]]\n\n[[open-in-colab]]\n\nsmolagents를 단 몇 분 만에 시작해보세요! 
이 가이드는 첫 번째 에이전트를 생성하고 실행하는 방법을 보여줍니다.\n\n### 설치[[installation]]\n\npip으로 smolagents를 설치하세요:\n\n```bash\npip install 'smolagents[toolkit]'  # 웹 검색과 같은 기본 도구 포함\n```\n\n### 첫 에이전트 만들기[[create-your-first-agent]]\n\n다음은 에이전트를 생성하고 실행하는 최소한의 예제입니다:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\n# 모델 초기화 (Hugging Face Inference API 사용)\nmodel = InferenceClientModel()  # 기본 모델 사용\n\n# 도구 없이 에이전트 생성\nagent = CodeAgent(tools=[], model=model)\n\n# 작업으로 에이전트 실행\nresult = agent.run(\"Calculate the sum of numbers from 1 to 10\")\nprint(result)\n```\n\n끝입니다! 에이전트가 Python 코드를 사용하여 작업을 해결하고 결과를 반환합니다.\n\n### 도구 추가[[adding-tools]]\n\n몇 가지 도구를 추가하여 에이전트를 더 강력하게 만들어보겠습니다:\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel, DuckDuckGoSearchTool\n\nmodel = InferenceClientModel()\nagent = CodeAgent(\n    tools=[DuckDuckGoSearchTool()],\n    model=model,\n)\n\n# 이제 에이전트가 웹을 검색할 수 있습니다!\nresult = agent.run(\"What is the current weather in Paris?\")\nprint(result)\n```\n\n### 다른 모델 사용하기[[using-different-models]]\n\n에이전트와 함께 다양한 모델을 사용할 수 있습니다:\n\n```python\n# Hugging Face의 특정 모델 사용\nmodel = InferenceClientModel(model_id=\"meta-llama/Llama-2-70b-chat-hf\")\n\n# OpenAI/Anthropic 사용 ('smolagents[litellm]' 필요)\nfrom smolagents import LiteLLMModel\nmodel = LiteLLMModel(model_id=\"gpt-4\")\n\n# 로컬 모델 사용 ('smolagents[transformers]' 필요)\nfrom smolagents import TransformersModel\nmodel = TransformersModel(model_id=\"meta-llama/Llama-2-7b-chat-hf\")\n```\n\n## 다음 단계[[next-steps]]\n\n- [설치 가이드](installation)에서 다양한 모델과 도구로 smolagents를 설정하는 방법을 알아보세요\n- 더 고급 기능은 [안내서](guided_tour)를 확인하세요\n- [커스텀 도구 구축](tutorials/tools)에 대해 알아보세요\n- [안전한 코드 실행](tutorials/secure_code_execution)을 살펴보세요\n- [멀티 에이전트 시스템](tutorials/building_good_agents) 생성 방법을 확인하세요\n\n<div class=\"mt-10\">\n  <div class=\"w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5\">\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow 
hover:shadow-lg\" href=\"./guided_tour\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">안내서</div>\n      <p class=\"text-gray-700\">기본 사항을 배우고 에이전트 사용에 익숙해지세요. 에이전트를 처음 사용하는 경우 여기서 시작하세요!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./examples/text_to_sql\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">실습 가이드</div>\n      <p class=\"text-gray-700\">특정 목표를 달성하는 데 도움이 되는 실용적인 가이드: SQL 쿼리를 생성하고 테스트하는 에이전트를 만들어보세요!</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./conceptual_guides/intro_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">개념 가이드</div>\n      <p class=\"text-gray-700\">중요한 주제에 대한 전체적인 이해를 돕는 설명입니다.</p>\n   </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./tutorials/building_good_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">튜토리얼</div>\n      <p class=\"text-gray-700\">에이전트 구축의 중요한 측면을 다루는 포괄적인 튜토리얼입니다.</p>\n    </a>\n  </div>\n</div>\n"
  },
  {
    "path": "docs/source/ko/installation.md",
    "content": "# 설치 옵션[[installation-options]]\n\n`smolagents` 라이브러리는 pip를 사용하여 설치할 수 있습니다. 사용 가능한 다양한 설치 방법과 옵션을 소개합니다.\n\n## 사전 요구사항[[prerequisites]]\n- Python 3.10 이상\n- Python 패키지 관리자: [`pip`](https://pip.pypa.io/en/stable/) 또는 [`uv`](https://docs.astral.sh/uv/)\n\n## 가상 환경[[virtual-environment]]\n\n`smolagents`를 Python 가상 환경 내에서 설치하는 것을 강력히 권장합니다.\n가상 환경은 프로젝트의 의존성을 다른 Python 프로젝트와 시스템 Python 설치로부터 격리하여\n버전 충돌을 방지하고 패키지 관리를 더욱 안정적으로 만들어줍니다.\n\n<hfoptions id=\"virtual-environment\">\n<hfoption id=\"venv\">\n\n[`venv`](https://docs.python.org/3/library/venv.html) 사용:\n\n```bash\npython -m venv .venv\nsource .venv/bin/activate\n```\n\n</hfoption>\n<hfoption id=\"uv\">\n\n[`uv`](https://docs.astral.sh/uv/) 사용:\n\n```bash\nuv venv .venv\nsource .venv/bin/activate\n```\n\n</hfoption>\n</hfoptions>\n\n## 기본 설치[[basic-installation]]\n\n`smolagents` 핵심 라이브러리를 설치합니다:\n\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install smolagents\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install smolagents\n```\n</hfoption>\n</hfoptions>\n\n## 추가 기능과 함께 설치[[installation-with-extras]]\n\n`smolagents`는 필요에 따라 설치할 수 있는 여러 선택적 의존성(extras)을 제공합니다.\n다음 구문을 사용하여 이러한 추가 기능을 설치할 수 있습니다:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install \"smolagents[extra1,extra2]\"\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install \"smolagents[extra1,extra2]\"\n```\n</hfoption>\n</hfoptions>\n\n### 도구[[tools]]\n다음 추가 기능은 다양한 도구와 통합을 포함합니다:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **toolkit**: 일반적인 작업을 위한 기본 도구 세트를 설치합니다.\n  ```bash\n  pip install \"smolagents[toolkit]\"\n  ```\n- **mcp**: 외부 도구 및 서비스와 통합하기 위한 Model Context Protocol (MCP) 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[mcp]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **toolkit**: 일반적인 작업을 위한 기본 도구 세트를 설치합니다.\n  ```bash\n  uv pip install \"smolagents[toolkit]\"\n  ```\n- **mcp**: 외부 도구 및 서비스와 통합하기 위한 Model Context Protocol (MCP) 지원을 
추가합니다.\n  ```bash\n  uv pip install \"smolagents[mcp]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### 모델 통합[[model-integration]]\n다음 추가 기능은 다양한 AI 모델 및 프레임워크와의 통합을 가능하게 합니다:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **openai**: OpenAI API 모델 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[openai]\"\n  ```\n- **transformers**: Hugging Face 트랜스포머 모델을 활성화합니다.\n  ```bash\n  pip install \"smolagents[transformers]\"\n  ```\n- **vllm**: 효율적인 모델 추론을 위한 VLLM 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[vllm]\"\n  ```\n- **mlx-lm**: MLX-LM 모델 지원을 활성화합니다.\n  ```bash\n  pip install \"smolagents[mlx-lm]\"\n  ```\n- **litellm**: 경량 모델 추론을 위한 LiteLLM 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[litellm]\"\n  ```\n- **bedrock**: AWS Bedrock 모델 지원을 활성화합니다.\n  ```bash\n  pip install \"smolagents[bedrock]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **openai**: OpenAI API 모델 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[openai]\"\n  ```\n- **transformers**: Hugging Face 트랜스포머 모델을 활성화합니다.\n  ```bash\n  uv pip install \"smolagents[transformers]\"\n  ```\n- **vllm**: 효율적인 모델 추론을 위한 VLLM 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[vllm]\"\n  ```\n- **mlx-lm**: MLX-LM 모델 지원을 활성화합니다.\n  ```bash\n  uv pip install \"smolagents[mlx-lm]\"\n  ```\n- **litellm**: 경량 모델 추론을 위한 LiteLLM 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[litellm]\"\n  ```\n- **bedrock**: AWS Bedrock 모델 지원을 활성화합니다.\n  ```bash\n  uv pip install \"smolagents[bedrock]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### 멀티모달 기능[[multimodal-capabilities]]\n다양한 미디어 유형 및 입력 처리를 위한 추가 기능:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **vision**: 이미지 처리 및 컴퓨터 비전 작업 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[vision]\"\n  ```\n- **audio**: 오디오 처리 기능을 활성화합니다.\n  ```bash\n  pip install \"smolagents[audio]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **vision**: 이미지 처리 및 컴퓨터 비전 작업 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[vision]\"\n  ```\n- **audio**: 오디오 처리 기능을 
활성화합니다.\n  ```bash\n  uv pip install \"smolagents[audio]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### 원격 실행[[remote-execution]]\n코드를 원격으로 실행하기 위한 추가 기능:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **docker**: Docker 컨테이너에서 코드를 실행하는 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[docker]\"\n  ```\n- **e2b**: 원격 실행을 위한 E2B 지원을 활성화합니다.\n  ```bash\n  pip install \"smolagents[e2b]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **docker**: Docker 컨테이너에서 코드를 실행하는 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[docker]\"\n  ```\n- **e2b**: 원격 실행을 위한 E2B 지원을 활성화합니다.\n  ```bash\n  uv pip install \"smolagents[e2b]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### 텔레메트리 및 사용자 인터페이스[[telemetry-and-user-interface]]\n텔레메트리, 모니터링 및 사용자 인터페이스 구성 요소를 위한 추가 기능:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n- **telemetry**: 모니터링 및 추적 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[telemetry]\"\n  ```\n- **gradio**: 대화형 Gradio UI 구성 요소 지원을 추가합니다.\n  ```bash\n  pip install \"smolagents[gradio]\"\n  ```\n</hfoption>\n<hfoption id=\"uv\">\n- **telemetry**: 모니터링 및 추적 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[telemetry]\"\n  ```\n- **gradio**: 대화형 Gradio UI 구성 요소 지원을 추가합니다.\n  ```bash\n  uv pip install \"smolagents[gradio]\"\n  ```\n</hfoption>\n</hfoptions>\n\n### 전체 설치[[complete-installation]]\n사용 가능한 모든 추가 기능을 설치하려면 다음을 사용할 수 있습니다:\n<hfoptions id=\"installation\">\n<hfoption id=\"pip\">\n```bash\npip install \"smolagents[all]\"\n```\n</hfoption>\n<hfoption id=\"uv\">\n```bash\nuv pip install \"smolagents[all]\"\n```\n</hfoption>\n</hfoptions>\n\n## 설치 확인[[verifying-installation]]\n설치 후, 다음 코드를 실행해 `smolagents`가 올바르게 설치되었는지 확인할 수 있습니다:\n```python\nimport smolagents\nprint(smolagents.__version__)\n```\n\n## 다음 단계[[next-steps]]\n`smolagents`를 성공적으로 설치했다면 다음을 수행할 수 있습니다:\n- [안내서](./guided_tour)를 따라 기본 개념을 배워보세요.\n- 실용적인 예제를 보고 싶다면 [사용법 가이드](./examples/text_to_sql)를 살펴보세요.\n- 고수준 설명을 보려면 [개념 가이드](./conceptual_guides/intro_agents)를 읽어보세요.\n- 에이전트 구축에 대한 심화 
튜토리얼은 [튜토리얼](./tutorials/building_good_agents)을 확인해보세요.\n- 클래스와 함수에 대한 자세한 정보를 확인하고 싶으시면 [API 레퍼런스](./reference/index)를 살펴보세요.\n"
  },
  {
    "path": "docs/source/ko/reference/agents.md",
    "content": "# 에이전트[[agents]]\n\n<Tip warning={true}>\n\nSmolagents는 실험적인 API로 언제든지 변경될 수 있습니다. API나 사용되는 모델이 변경될 수 있기 때문에 에이전트가 반환하는 결과도 달라질 수 있습니다.\n\n</Tip>\n\n에이전트와 도구에 대해 더 자세히 알아보려면 [소개 가이드](../index)를 꼭 읽어보세요. 이 페이지에는 기본 클래스에 대한 API 문서가 포함되어 있습니다.\n\n## 에이전트[[agents]]\n\n저희 에이전트는 [`MultiStepAgent`]를 상속받으며, 이는 하나의 생각과 하나의 도구 호출 및 실행으로 구성된 여러 단계를 수행할 수 있음을 의미합니다. 이 [개념 가이드](../conceptual_guides/react)에서 더 자세히 알아보세요.\n\n저희는 메인 [`Agent`] 클래스를 기반으로 두 가지 유형의 에이전트를 제공합니다.\n  - [`CodeAgent`]는 Python 코드로 도구 호출을 작성합니다.(이것이 기본값입니다.)\n  - [`ToolCallingAgent`]는 JSON 형식으로 도구 호출을 작성합니다.\n\n두 경우 모두 초기화 시 `model`과 도구 목록인 `tools`를 인수로 요구합니다.\n\n### 에이전트 클래스[[smolagents.MultiStepAgent]]\n\n[[autodoc]] MultiStepAgent\n\n[[autodoc]] CodeAgent\n\n[[autodoc]] ToolCallingAgent\n\n### stream_to_gradio[[smolagents.stream_to_gradio]]\n\n[[autodoc]] stream_to_gradio\n\n### GradioUI[[smolagents.GradioUI]]\n\n> [!TIP]\n> UI를 사용하려면 `gradio`가 설치되어 있어야 합니다. 설치되어 있지 않다면 `pip install 'smolagents[gradio]'`를 실행해주세요.\n\n[[autodoc]] GradioUI\n\n## 프롬프트[[smolagents.PromptTemplates]]\n\n[[autodoc]] smolagents.agents.PromptTemplates\n\n[[autodoc]] smolagents.agents.PlanningPromptTemplate\n\n[[autodoc]] smolagents.agents.ManagedAgentPromptTemplate\n\n[[autodoc]] smolagents.agents.FinalAnswerPromptTemplate\n\n## 메모리[[smolagents.AgentMemory]]\n\nSmolagents는 여러 단계에 걸쳐 정보를 저장하기 위해 메모리를 사용합니다.\n\n[[autodoc]] smolagents.memory.AgentMemory\n\n## Python 코드 실행기[[smolagents.PythonExecutor]]\n\n[[autodoc]] smolagents.local_python_executor.PythonExecutor\n\n### 로컬 Python 실행기[[smolagents.LocalPythonExecutor]]\n\n[[autodoc]] smolagents.local_python_executor.LocalPythonExecutor\n\n### 원격 Python 실행기[[smolagents.remote_executors.RemotePythonExecutor]]\n\n[[autodoc]] smolagents.remote_executors.RemotePythonExecutor\n\n#### E2BExecutor[[smolagents.E2BExecutor]]\n\n[[autodoc]] smolagents.remote_executors.E2BExecutor\n\n#### DockerExecutor[[smolagents.DockerExecutor]]\n\n[[autodoc]] 
smolagents.remote_executors.DockerExecutor\n\n#### WasmExecutor[[smolagents.WasmExecutor]]\n\n[[autodoc]] smolagents.remote_executors.WasmExecutor"
  },
  {
    "path": "docs/source/ko/reference/models.md",
    "content": "# 모델[[models]]\n\n<Tip warning={true}>\n\nSmolagents는 언제든지 변경될 수 있는 실험적인 API입니다. API 또는 기반 모델이 바뀌면 에이전트가 반환하는 결과도 달라질 수 있습니다.\n\n</Tip>\n\n에이전트와 도구에 대한 자세한 내용은 꼭 [소개 가이드](../index)를 읽어보시기 바랍니다. 이 페이지는 기반 클래스에 대한 API 문서를 포함하고 있습니다.\n\n## 모델[[models]]\n\nsmolagents의 모든 모델 클래스는 추가 키워드 인수(`temperature`, `max_tokens`, `top_p` 등)를 인스턴스화 시점에 바로 전달할 수 있습니다.\n이 파라미터들은 기반 모델의 생성 호출에 자동으로 전달되어, 창의성, 응답 길이, 샘플링 전략과 같은 모델의 동작을 설정할 수 있습니다.\n\n### 기본 모델[[smolagents.Model]]\n\n`Model` 클래스는 모든 모델 구현의 기반이 되는 클래스이며, 사용자 정의 모델이 에이전트와 함께 작동하기 위해 구현해야 하는 핵심 인터페이스를 제공합니다.\n\n[[autodoc]] Model\n\n### API 모델[[smolagents.ApiModel]]\n\n`ApiModel` 클래스는 모든 API 기반 모델 구현의 토대가 되며, 외부 API 상호 작용, 속도 제한, 클라이언트 관리 등 모델이 상속하는 공통 기능을 제공합니다.\n\n[[autodoc]] ApiModel\n\n### TransformersModel[[smolagents.TransformersModel]]\n\n편의를 위해, 초기화 시 주어진 model_id에 대한 로컬 `transformers` 파이프라인을 구축하여 위 사항들을 구현하는 `TransformersModel`을 추가했습니다.\n\n```python\nfrom smolagents import TransformersModel\n\nmodel = TransformersModel(model_id=\"HuggingFaceTB/SmolLM2-360M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"좋아!\"}]}], stop_sequences=[\"이\"]))\n```\n```text\n>>> 좋아! 아래와 같\n```\n\n기반 모델에서 지원하는 모든 키워드 인수(`temperature`, `max_new_tokens`, `top_p` 등)를 인스턴스화 시점에 직접 전달할 수 있습니다. 이들은 모델 생성 호출로 전달됩니다:\n\n```python\nmodel = TransformersModel(\n    model_id=\"HuggingFaceTB/SmolLM2-360M-Instruct\",\n    temperature=0.7,\n    max_new_tokens=1000\n)\n```\n\n> [!TIP]\n> 사용자의 컴퓨터에 `transformers`와 `torch`가 설치되어 있어야 합니다. 설치되지 않은 경우 `pip install 'smolagents[transformers]'`를 실행하십시오.\n\n[[autodoc]] TransformersModel\n\n### InferenceClientModel[[smolagents.InferenceClientModel]]\n\n`InferenceClientModel`은 LLM 실행을 위해 huggingface_hub의 [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference)를 래핑합니다. 이는 Hub에서 사용할 수 있는 모든 [Inference Providers](https://huggingface.co/docs/inference-providers/index)를 지원합니다. 
Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together 등이 있습니다.\n\n또한 `requests_per_minute` 인수를 사용하여 분당 요청 수로 속도 제한을 설정할 수 있습니다:\n\n```python\nfrom smolagents import InferenceClientModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"안녕하세요, 잘 지내고 계신가요?\"}]}\n]\n\nmodel = InferenceClientModel(provider=\"novita\", requests_per_minute=60)\nprint(model(messages))\n```\n```text\n>>> 안녕하세요. 덕분에 잘 지내고 있습니다.\n```\n\n기반 모델에서 지원하는 모든 키워드 인수(`temperature`, `max_tokens`, `top_p` 등)를 인스턴스화 시점에 직접 전달할 수 있습니다. 이들은 모델 생성 호출로 전달됩니다:\n\n```python\nmodel = InferenceClientModel(\n    provider=\"novita\",\n    requests_per_minute=60,\n    temperature=0.8,\n    max_tokens=500\n)\n```\n\n[[autodoc]] InferenceClientModel\n\n### LiteLLMModel[[smolagents.LiteLLMModel]]\n\n`LiteLLMModel`은 [LiteLLM](https://www.litellm.ai/)을 활용하여 다양한 제공업체의 100개 이상의 LLM을 지원합니다.\n모델 초기화 시 키워드 인수를 전달하면, 이후 모델을 사용할 때마다 해당 설정이 적용됩니다. 예를 들어 아래에서는 `temperature`를 전달합니다. 또한 `requests_per_minute` 인수를 통해 분당 요청 수를 제한할 수도 있습니다.\n\n```python\nfrom smolagents import LiteLLMModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"안녕하세요, 잘 지내고 계신가요?\"}]}\n]\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", temperature=0.2, max_tokens=10, requests_per_minute=60)\nprint(model(messages))\n```\n\n[[autodoc]] LiteLLMModel\n\n### LiteLLMRouterModel[[smolagents.LiteLLMRouterModel]]\n\n`LiteLLMRouterModel`은 [LiteLLM Router](https://docs.litellm.ai/docs/routing)를 감싼 래퍼로, 다양한 고급 라우팅 전략을 지원합니다. 
예를 들어, 여러 배포 환경 간 로드 밸런싱, 큐 기반의 중요 요청 우선 처리, 쿨다운, 폴백, 지수적 백오프 재시도 같은 기본 신뢰성 조치 구현 기능을 제공합니다.\n\n```python\nimport os\n\nfrom smolagents import LiteLLMRouterModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"안녕하세요, 잘 지내고 계신가요?\"}]}\n]\n\nmodel = LiteLLMRouterModel(\n    model_id=\"llama-3.3-70b\",\n    model_list=[\n        {\n            \"model_name\": \"llama-3.3-70b\",\n            \"litellm_params\": {\"model\": \"groq/llama-3.3-70b\", \"api_key\": os.getenv(\"GROQ_API_KEY\")},\n        },\n        {\n            \"model_name\": \"llama-3.3-70b\",\n            \"litellm_params\": {\"model\": \"cerebras/llama-3.3-70b\", \"api_key\": os.getenv(\"CEREBRAS_API_KEY\")},\n        },\n    ],\n    client_kwargs={\n        \"routing_strategy\": \"simple-shuffle\",\n    },\n)\nprint(model(messages))\n```\n\n[[autodoc]] LiteLLMRouterModel\n\n### OpenAIModel[[smolagents.OpenAIModel]]\n\n이 클래스를 사용하면 OpenAI 서버와 호환되는 모든 모델을 호출할 수 있습니다. 설정 방법은 다음과 같습니다 (`api_base` URL을 다른 서버를 가리키도록 사용자 정의할 수 있습니다):\n```py\nimport os\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"gpt-4o\",\n    api_base=\"https://api.openai.com/v1\",\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n)\n```\n\n기반 모델에서 지원하는 모든 키워드 인수(`temperature`, `max_tokens`, `top_p` 등)를 인스턴스화 시점에 직접 전달할 수 있습니다. 이들은 모델 생성 호출로 전달됩니다:\n\n```py\nmodel = OpenAIModel(\n    model_id=\"gpt-4o\",\n    api_base=\"https://api.openai.com/v1\",\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n    temperature=0.7,\n    max_tokens=1000,\n    top_p=0.9,\n)\n```\n\n[[autodoc]] OpenAIModel\n\n### AzureOpenAIModel[[smolagents.AzureOpenAIModel]]\n\n`AzureOpenAIModel`을 사용하면 모든 Azure OpenAI 배포에 연결할 수 있습니다.\n\n아래에서 설정 예시를 확인할 수 있습니다. `azure_endpoint`, `api_key`, `api_version` 인수는 환경 변수(`AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, `OPENAI_API_VERSION`)를 설정해 두면 생략할 수 있습니다.\n\n`OPENAI_API_VERSION`에 `AZURE_` 접두사가 포함되지 않는다는 점을 주의하시기 바랍니다. 
이는 기반이 되는 [openai](https://github.com/openai/openai-python) 패키지의 설계 방식 때문입니다.\n\n```py\nimport os\n\nfrom smolagents import AzureOpenAIModel\n\nmodel = AzureOpenAIModel(\n    model_id = os.environ.get(\"AZURE_OPENAI_MODEL\"),\n    azure_endpoint=os.environ.get(\"AZURE_OPENAI_ENDPOINT\"),\n    api_key=os.environ.get(\"AZURE_OPENAI_API_KEY\"),\n    api_version=os.environ.get(\"OPENAI_API_VERSION\")    \n)\n```\n\n[[autodoc]] AzureOpenAIModel\n\n### AmazonBedrockModel[[smolagents.AmazonBedrockModel]]\n\n`AmazonBedrockModel`은 Amazon Bedrock에 연결하고, 사용할 수 있는 모든 모델에서 에이전트를 실행할 수 있도록 지원합니다.\n\n아래는 설정 예시입니다. 이 클래스는 사용자 정의를 위한 추가 옵션도 제공합니다.\n\n```py\nimport os\n\nfrom smolagents import AmazonBedrockModel\n\nmodel = AmazonBedrockModel(\n    model_id = os.environ.get(\"AMAZON_BEDROCK_MODEL_ID\"),\n)\n```\n\n[[autodoc]] AmazonBedrockModel\n\n### MLXModel[[smolagents.MLXModel]]\n\n\n```python\nfrom smolagents import MLXModel\n\nmodel = MLXModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": \"좋아!\"}], stop_sequences=[\"이\"]))\n```\n```text\n>>> 좋아! 아래와 같\n```\n\n> [!TIP]\n> 사용자의 컴퓨터에 `mlx-lm`이 설치되어 있어야 합니다. 설치되지 않은 경우 `pip install 'smolagents[mlx-lm]'`를 실행해 설치합니다.\n\n[[autodoc]] MLXModel\n\n### VLLMModel[[smolagents.VLLMModel]]\n\n빠른 LLM 추론 및 서빙을 위해 [vLLM](https://docs.vllm.ai/)을 사용하는 모델입니다.\n\n```python\nfrom smolagents import VLLMModel\n\nmodel = VLLMModel(model_id=\"HuggingFaceTB/SmolLM2-360M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": \"좋아!\"}], stop_sequences=[\"이\"]))\n```\n\n> [!TIP]\n> 사용자의 컴퓨터에 `vllm`이 설치되어 있어야 합니다. 설치되지 않은 경우 `pip install 'smolagents[vllm]'`를 실행하세요.\n\n[[autodoc]] VLLMModel\n\n### 사용자 정의 모델[[custom-model]]\n\n자유롭게 자신만의 모델을 만들어 에이전트를 구동하는 데 사용할 수 있습니다.\n\n기본 `Model` 클래스를 상속받아 에이전트를 위한 모델을 만들 수 있습니다.\n주요 기준은 `generate` 메소드를 오버라이드하는 것이며, 다음 두 가지 기준을 따릅니다:\n1. 입력으로 전달되는 `messages`는 [메시지 형식](./chat_templating)(`List[Dict[str, str]]`)을 따라야 하며 `.content` 속성을 가진 객체를 반환합니다.\n2. 
`stop_sequences` 인수로 전달된 시퀀스에서 출력을 중단합니다.\n\nLLM을 정의하기 위해, 기본 `Model` 클래스를 상속하는 `CustomModel` 클래스를 만들 수 있습니다.\n이 클래스는 [메시지](./chat_templating) 리스트를 받아 텍스트를 포함하는 `.content` 속성을 가진 객체를 반환하는 `generate` 메소드를 가져야 합니다. `generate` 메소드는 또한 생성을 중단할 시점을 나타내는 `stop_sequences` 인수를 받아들여야 합니다.\n\n```python\nfrom huggingface_hub import login, InferenceClient\n\nfrom smolagents import Model\n\nlogin(\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nclient = InferenceClient(model=model_id)\n\nclass CustomModel(Model):\n    def generate(self, messages, stop_sequences=None, **kwargs):\n        # 가변 기본 인수를 피하고, 값이 없으면 [\"Task\"]에서 생성을 중단합니다.\n        stop_sequences = stop_sequences or [\"Task\"]\n        response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1024)\n        answer = response.choices[0].message\n        return answer\n\ncustom_model = CustomModel()\n```\n\n또한, `generate` 메소드는 `grammar` 인수를 받아 [제약된 생성](https://huggingface.co/docs/text-generation-inference/conceptual/guidance)을 허용하여 올바른 형식의 에이전트 출력을 강제할 수 있습니다.\n"
  },
  {
    "path": "docs/source/ko/reference/tools.md",
    "content": "# 도구[[tools]]\n\n<Tip warning={true}>\n\nSmolagents는 언제든지 변경될 수 있는 실험적인 API입니다. API나 사용되는 모델이 변경될 수 있기 때문에 에이전트가 반환하는 결과도 달라질 수 있습니다.\n\n</Tip>\n\n에이전트와 도구에 대해 더 자세히 알아보려면 [소개 가이드](../index)를 꼭 읽어보세요. 이 페이지에는 기본 클래스에 대한 API 문서가 포함되어 있습니다.\n\n## 도구 기본 클래스[[tool-base-classes]]\n\n### load_tool[[smolagents.load_tool]]\n\n[[autodoc]] load_tool\n\n### tool[[smolagents.tool]]\n\n[[autodoc]] tool\n\n### Tool[[smolagents.Tool]]\n\n[[autodoc]] Tool\n\n### launch_gradio_demo[[smolagents.launch_gradio_demo]]\n\n[[autodoc]] launch_gradio_demo\n\n## ToolCollection[[smolagents.ToolCollection]]\n\n[[autodoc]] ToolCollection\n\n## MCP 클라이언트[[smolagents.MCPClient]]\n\n[[autodoc]] smolagents.mcp_client.MCPClient\n\n## 에이전트 타입[[agent-types]]\n\n에이전트는 도구 간에 모든 유형의 객체를 처리할 수 있습니다. 각 도구는 완전한 멀티모달을 지원하므로 텍스트, 이미지, 오디오, 비디오 등 다양한 형태의 데이터를 입력받거나 반환할 수 있습니다. 도구 간의 호환성을 높이고, ipython(jupyter, colab, ipython 노트북 등)에서 이러한 반환값이 올바르게 렌더링되도록 하기 위해 래퍼 클래스를 구현하여 이러한 타입들을 감쌉니다.\n\n래퍼 객체는 본래의 동작을 유지해야 합니다. 예를 들어, 텍스트 객체는 여전히 문자열처럼 동작해야 하고, 이미지 객체는 여전히 `PIL.Image`처럼 동작해야 합니다.\n\n이러한 타입들은 세 가지 특정 목적을 가집니다:\n\n- 타입에 `to_raw`를 호출하면 기본 객체를 반환해야 합니다.\n- 타입에 `to_string`을 호출하면 객체를 문자열로 반환해야 합니다. `AgentText`의 경우에는 해당 문자열이 될 수 있지만, 그 외의 다른 인스턴스에서는 객체의 직렬화된 버전의 경로가 반환됩니다.\n- ipython 커널에 표시할 때 객체가 올바르게 표시되어야 합니다.\n\n### AgentText[[smolagents.AgentText]]\n\n[[autodoc]] smolagents.agent_types.AgentText\n\n### AgentImage[[smolagents.AgentImage]]\n\n[[autodoc]] smolagents.agent_types.AgentImage\n\n### AgentAudio[[smolagents.AgentAudio]]\n\n[[autodoc]] smolagents.agent_types.AgentAudio"
  },
  {
    "path": "docs/source/ko/tutorials/building_good_agents.md",
    "content": "# 좋은 에이전트 구축하기[[building-good-agents]]\n\n[[open-in-colab]]\n\n성공하는 에이전트와 실패하는 에이전트 사이에는 큰 차이가 있습니다.\n성공하는 에이전트는 어떻게 만들 수 있을까요?\n이 가이드에서 에이전트 구축의 핵심 원칙들을 소개하겠습니다.\n\n> [!TIP]\n> 에이전트 구축이 처음이라면 먼저 [에이전트 소개](../conceptual_guides/intro_agents)와 [안내서](../guided_tour)를 읽어보세요.\n\n### 최고의 에이전트 시스템은 가장 단순합니다: 워크플로우를 최대한 단순하게 만드세요[[the-best-agentic-systems-are-the-simplest:-simplify-the-workflow-as-much-as-you-can]]\n\n워크플로우에 LLM에게 어느 정도의 자율성을 부여하는 것은 오류가 발생할 위험이 있습니다.\n\n잘 설계된 에이전트 시스템은 오류를 기록하고 다시 시도하는 기능을 통해 LLM이 자신의 실수를 교정할 수 있게 해줍니다. 그렇다고 해도 처음부터 LLM이 실수하지 않도록 워크플로우를 간단하게 만드는 것이 훨씬 효과적입니다.\n\n[에이전트 소개](../conceptual_guides/intro_agents)의 예시를 다시 살펴보겠습니다: 서핑 여행사 이용자들의 문의에 대응하는 봇입니다.\n새로운 서핑 스팟에 대해 질문을 받을 때마다 에이전트가 \"여행 거리 API\"와 \"날씨 API\"에 각각 2번의 서로 다른 호출을 하도록 하는 대신, 두 API를 한 번에 호출하고 연결된 출력을 사용자에게 반환하는 함수인 \"return_spot_information\"이라는 하나의 통합된 도구를 만들 수 있습니다.\n\n이렇게 하면 비용, 지연 시간, 오류 위험을 줄일 수 있습니다!\n\n주요 지침은 다음과 같습니다: LLM 호출 횟수를 최대한 줄이세요.\n\n이것은 몇 가지 결론으로 이어집니다:\n- 가능하면 언제든지 두 개의 API 예시처럼 2개의 도구를 하나로 그룹화하세요.\n- 가능하면 언제든지 로직은 에이전트의 결정보다는 결정론적 함수로 처리해야 합니다.\n\n### LLM 엔진으로의 정보 흐름을 개선하세요[[improve-the-information-flow-to-the-llm-engine]]\n\nLLM은 쪽지를 통해서만 소통할 수 있는 밀폐된 방 안의 *똑똑한* 로봇이라고 생각하면 됩니다.\n\n프롬프트에 명시하지 않으면 무슨 일이 일어났는지 전혀 알 수 없습니다.\n\n그러니까 일단 작업을 아주 명확하게 정의하는 것부터 시작하세요!\n에이전트는 LLM으로 작동하기 때문에, 작업을 설명하는 방식이 조금만 달라져도 결과가 완전히 바뀔 수 있습니다.\n\n그 다음엔 도구에서 에이전트로 정보가 잘 전달되도록 개선해야 합니다.\n\n구체적으로는 이렇게 하세요:\n- 각 도구는 LLM에게 도움이 될 만한 정보를 모두 기록해야 합니다.(도구의 `forward` 메서드 안에서 `print`문을 쓰기만 하면 됩니다.)\n  - 특히 도구 실행 오류에 대한 자세한 정보를 기록하면 큰 도움이 됩니다!\n\n예를 들어 위치와 날짜-시간을 받아서 날씨 데이터를 가져오는 도구를 보겠습니다:\n\n먼저 좋지 않은 버전입니다:\n```python\nimport datetime\nfrom smolagents import tool\n\ndef get_weather_report_at_coordinates(coordinates, date_time):\n    # 더미 함수, [섭씨 온도, 0-1 척도의 비 올 확률, 미터 단위 파도 높이] 리스트를 반환\n    return [28.0, 0.35, 0.85]\n\ndef convert_location_to_coordinates(location):\n    # 더미 좌표를 반환\n    return [3.3, -42.0]\n\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n 
   \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for.\n        date_time: the date and time for which you want the report.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    date_time = datetime.datetime.strptime(date_time, '%m/%d/%y %H:%M:%S')\n    return str(get_weather_report_at_coordinates((lon, lat), date_time))\n```\n\n문제점은 무엇일까요?\n- `date_time`에 사용해야 하는 형식에 대한 정확한 설명이 없습니다.\n- 위치를 어떻게 지정해야 하는지에 대한 세부 정보가 없습니다.\n- 위치가 적절한 형식이 아니거나 `date_time`이 제대로 형식화되지 않은 경우와 같은 실패 사례를 명시적으로 기록할 수 있는 로깅 메커니즘이 없습니다.\n- 출력 형식을 이해하기 어렵습니다.\n\n도구 호출이 실패하면 메모리에 기록된 오류 트레이스백은 LLM이 도구를 역설계하여 오류를 수정하는 데 도움이 될 수 있습니다. 하지만 굳이 LLM에게 그렇게 번거로운 작업을 떠넘길 이유가 있을까요?\n\n이 도구를 구축하는 더 나은 방법은 다음과 같습니다:\n```python\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for. Should be a place name, followed by possibly a city name, then a country, like \"Anchor Point, Taghazout, Morocco\".\n        date_time: the date and time for which you want the report, formatted as '%m/%d/%y %H:%M:%S'.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    try:\n        date_time = datetime.datetime.strptime(date_time, '%m/%d/%y %H:%M:%S')\n    except Exception as e:\n        raise ValueError(\"Conversion of `date_time` to datetime format failed, make sure to provide a string in format '%m/%d/%y %H:%M:%S'. 
Full trace:\" + str(e))\n    temperature_celsius, risk_of_rain, wave_height = get_weather_report_at_coordinates((lon, lat), date_time)\n    return f\"Weather report for {location}, {date_time}: Temperature will be {temperature_celsius}°C, risk of rain is {risk_of_rain*100:.0f}%, wave height is {wave_height}m.\"\n```\n\nLLM의 부담을 덜어주려면 이런 질문을 해보세요: \"만약 내가 아무것도 모르는 상태에서 이 도구를 처음 사용한다면, 실수했을 때 스스로 고치기가 얼마나 쉬울까?\"\n\n### 에이전트에 더 많은 매개변수 제공[[give-more-arguments-to-the-agent]]\n\n작업을 설명하는 단순한 문자열 외에 에이전트에 추가 객체를 전달하려면 `additional_args` 매개변수를 사용하여 모든 유형의 객체를 전달할 수 있습니다:\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(model_id=model_id), add_base_tools=True)\n\nagent.run(\n    \"Why does Mike not know many people in New York?\",\n    additional_args={\"mp3_sound_file_url\":'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3'}\n)\n```\n예를 들어, `additional_args` 매개변수를 통해 에이전트가 활용할 수 있도록 원하는 이미지나 문자열을 전달할 수 있습니다.\n\n## 에이전트 디버깅 방법[[how-to-debug-your-agent]]\n\n### 1. 
더 강력한 LLM 사용[[use-a-stronger-llm]]\n\n에이전트 워크플로우에서 발생하는 오류 중 일부는 실제 오류이고, 다른 일부는 LLM 엔진이 제대로 추론하지 못한 탓입니다.\n예를 들어, 자동차 그림을 만들어 달라고 요청한 `CodeAgent`에 대한 다음 추적을 고려해보세요:\n```\n==================================================================================================== New task ====================================================================================================\nMake me a cool car picture\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nimage_generator(prompt=\"A cool, futuristic sports car with LED headlights, aerodynamic design, and vibrant color, high-res, photorealistic\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nStep 1:\n\n- Time taken: 16.35 seconds\n- Input tokens: 1,383\n- Output tokens: 77\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: 
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nfinal_answer(\"/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nPrint outputs:\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nFinal answer:\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\n```\n사용자는 이미지가 반환되는 대신 경로가 반환되는 것을 보게 됩니다.\n시스템의 버그처럼 보일 수 있지만, 실제로는 에이전트 시스템이 오류를 일으킨 것이 아닙니다: 단지 LLM이 이미지 출력을 변수에 저장하지 않는 실수를 했을 뿐입니다.\n그 결과 이미지를 저장할 때 로깅된 경로 외에는 이미지에 다시 접근할 방법이 없어, 이미지 대신 경로를 반환한 것입니다.\n\n따라서 에이전트를 디버깅하는 첫 번째 단계는 \"더 강력한 LLM을 사용하는 것\"입니다. `Qwen/Qwen2.5-72B-Instruct`와 같은 대안은 그런 실수를 하지 않았을 것입니다.\n\n### 2. 더 많은 정보나 구체적인 지침 제공[[provide-more-information-or-specific-instructions]]\n\n더 자세하게 안내해준다면 성능이 낮은 모델도 충분히 사용할 수 있습니다.\n\n모델의 관점에서 생각해보세요: 내가 모델이 되어서 이 작업을 해결해야 한다면, 지금 주어진 정보(시스템 프롬프트 + 작업 설명 + 도구 설명)만으로도 충분할까요?\n\n더 구체적인 안내가 필요할까요?\n\n- 지침이 항상 에이전트에게 주어져야 하는 경우(일반적으로 시스템 프롬프트가 동작하는 방식처럼): 에이전트 초기화 시 `instructions` 매개변수에 문자열로 전달할 수 있습니다.\n- 해결할 특정 작업에 관한 것이라면: 이 모든 세부 사항을 작업에 추가하세요. 작업은 수십 페이지처럼 매우 길 수 있습니다.\n- 특정 도구 사용 방법에 관한 것이라면: 해당 도구의 `description` 속성에 포함시키세요.\n\n### 3. 프롬프트 템플릿 변경 (일반적으로 권장되지 않음)[[change-the-prompt-templates-(generally-not-advised)]]\n\n위의 방법들로도 부족하다면 에이전트의 프롬프트 템플릿을 직접 수정할 수 있습니다.\n\n작동 원리를 살펴보겠습니다. 
[`CodeAgent`]의 기본 프롬프트 템플릿을 예로 들어보겠습니다(제로샷 예제는 생략하고 간단히 정리했습니다).\n\n```python\nprint(agent.prompt_templates[\"system_prompt\"])\n```\n출력 결과는 다음과 같습니다:\n```text\nYou are an expert assistant who can solve any task using code blobs. You will be given a task to solve as best you can.\nTo do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\nTo solve the task, you must plan forward to proceed in a series of steps, in a cycle of Thought, Code, and Observation sequences.\n\nAt each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.\nThen in the Code sequence you should write the code in simple Python. The code sequence must be opened with '{{code_block_opening_tag}}', and closed with '{{code_block_closing_tag}}'.\nDuring each intermediate step, you can use 'print()' to save whatever important information you will then need.\nThese print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\nIn the end you have to return a final answer using the `final_answer` tool.\n\nHere are a few examples using notional tools:\n---\nTask: \"Generate an image of the oldest person in this document.\"\n\nThought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\n{{code_block_opening_tag}}\nanswer = document_qa(document=document, question=\"Who is the oldest person mentioned?\")\nprint(answer)\n{{code_block_closing_tag}}\nObservation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\nThought: I will now generate an image showcasing the oldest person.\n{{code_block_opening_tag}}\nimage = image_generator(\"A portrait of John Doe, a 55-year-old man living in 
Canada.\")\nfinal_answer(image)\n{{code_block_closing_tag}}\n\n---\nTask: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\nThought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool\n{{code_block_opening_tag}}\nresult = 5 + 3 + 1294.678\nfinal_answer(result)\n{{code_block_closing_tag}}\n\n---\nTask:\n\"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.\nYou have been provided with these additional arguments, that you can access using the keys as variables in your python code:\n{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}\"\n\nThought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.\n{{code_block_opening_tag}}\ntranslated_question = translator(question=question, src_lang=\"French\", tgt_lang=\"English\")\nprint(f\"The translated question is {translated_question}.\")\nanswer = image_qa(image=image, question=translated_question)\nfinal_answer(f\"The answer is {answer}\")\n{{code_block_closing_tag}}\n\n---\nTask:\nIn a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\nWhat does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\nThought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\n{{code_block_opening_tag}}\npages = web_search(query=\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\")\nprint(pages)\n{{code_block_closing_tag}}\nObservation:\nNo result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\nThought: The query was maybe too restrictive and did not find any results. 
Let's try again with a broader query.\n{{code_block_opening_tag}}\npages = web_search(query=\"1979 interview Stanislaus Ulam\")\nprint(pages)\n{{code_block_closing_tag}}\nObservation:\nFound 6 pages:\n[Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n[Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n(truncated)\n\nThought: I will read the first 2 pages to know more.\n{{code_block_opening_tag}}\nfor url in [\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\", \"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\"]:\n    whole_page = visit_webpage(url)\n    print(whole_page)\n    print(\"\\n\" + \"=\"*80 + \"\\n\")  # Print separator between pages\n{{code_block_closing_tag}}\nObservation:\nManhattan Project Locations:\nLos Alamos, NM\nStanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. 
In this interview, he discusses his work at\n(truncated)\n\nThought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\" Let's answer in one word.\n{{code_block_opening_tag}}\nfinal_answer(\"diminished\")\n{{code_block_closing_tag}}\n\n---\nTask: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\nThought: I need to get the populations for both cities and compare them: I will use the tool `web_search` to get the population of both cities.\n{{code_block_opening_tag}}\nfor city in [\"Guangzhou\", \"Shanghai\"]:\n    print(f\"Population {city}:\", web_search(f\"{city} population\")\n{{code_block_closing_tag}}\nObservation:\nPopulation Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\nPopulation Shanghai: '26 million (2019)'\n\nThought: Now I know that Shanghai has the highest population.\n{{code_block_opening_tag}}\nfinal_answer(\"Shanghai\")\n{{code_block_closing_tag}}\n\n---\nTask: \"What is the current age of the pope, raised to the power 0.36?\"\n\nThought: I will use the tool `wikipedia_search` to get the age of the pope, and confirm that with a web search.\n{{code_block_opening_tag}}\npope_age_wiki = wikipedia_search(query=\"current pope age\")\nprint(\"Pope age as per wikipedia:\", pope_age_wiki)\npope_age_search = web_search(query=\"current pope age\")\nprint(\"Pope age as per google search:\", pope_age_search)\n{{code_block_closing_tag}}\nObservation:\nPope age: \"The pope Francis is currently 88 years old.\"\n\nThought: I know that the pope is 88 years old. Let's compute the result using python code.\n{{code_block_opening_tag}}\npope_current_age = 88 ** 0.36\nfinal_answer(pope_current_age)\n{{code_block_closing_tag}}\n\nAbove example were using notional tools that might not exist for you. 
On top of performing computations in the Python code snippets that you create, you only have access to these tools, behaving like regular python functions:\n{{code_block_opening_tag}}\n{%- for tool in tools.values() %}\n{{ tool.to_code_prompt() }}\n{% endfor %}\n{{code_block_closing_tag}}\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n{{code_block_opening_tag}}\n{%- for agent in managed_agents.values() %}\ndef {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n    \"\"\"{{ agent.description }}\n\n    Args:\n        task: Long detailed description of the task.\n        additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n    \"\"\"\n{% endfor %}\n{{code_block_closing_tag}}\n{%- endif %}\n\nHere are the rules you should always follow to solve your task:\n1. Always provide a 'Thought:' sequence, and a '{{code_block_opening_tag}}' sequence ending with '{{code_block_closing_tag}}', else you will fail.\n2. Use only variables that you have defined!\n3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wikipedia_search(query=\"What is the place where James Bond lives?\")'.\n4. For tools WITHOUT JSON output schema: Take care to not chain too many sequential tool calls in the same code block, as their output format is unpredictable. 
For instance, a call to wikipedia_search without a JSON output schema has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n5. For tools WITH JSON output schema: You can confidently chain multiple tool calls and directly access structured output fields in the same code block! When a tool has a JSON output schema, you know exactly what fields and data types to expect, allowing you to write robust code that directly accesses the structured response (e.g., result['field_name']) without needing intermediate print() statements.\n6. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n7. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n8. Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n9. You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n10. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n11. Don't give up! You're in charge of solving the task, not providing directions to solve it.\n\n{%- if custom_instructions %}\n{{custom_instructions}}\n{%- endif %}\n\nNow Begin!\n```\n\n보시다시피 `\"{{ tool.description }}\"`와 같은 플레이스홀더들이 있습니다. 이것들은 에이전트를 초기화할 때 도구나 관리 에이전트에 대한 설명을 자동으로 넣어주는 역할을 합니다.\n\n따라서 `system_prompt` 매개변수에 커스텀 프롬프트를 넣어서 기본 시스템 프롬프트 템플릿을 덮어쓸 수 있습니다. 
새로운 시스템 프롬프트에는 이런 플레이스홀더들을 포함할 수 있습니다:\n- 도구 설명을 삽입하려면:\n  ```\n  {%- for tool in tools.values() %}\n  - {{ tool.to_tool_calling_prompt() }}\n  {%- endfor %}\n  ```\n- 관리되는 에이전트가 있는 경우 해당 설명을 삽입하려면:\n  ```\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  {%- for agent in managed_agents.values() %}\n  - {{ agent.name }}: {{ agent.description }}\n  {%- endfor %}\n  {%- endif %}\n  ```\n- `CodeAgent`에만 해당하며, 승인된 import 목록을 삽입하려면: `\"{{authorized_imports}}\"`\n\n그런 다음 다음과 같이 시스템 프롬프트를 변경할 수 있습니다:\n\n```py\nagent.prompt_templates[\"system_prompt\"] = agent.prompt_templates[\"system_prompt\"] + \"\\nHere you go!\"\n```\n\n이는 [`ToolCallingAgent`]에서도 작동합니다.\n\n하지만 일반적으로 다음과 같이 에이전트 초기화 시 `instructions` 매개변수를 전달하는 것이 더 간단합니다:\n```py\nagent = CodeAgent(tools=[], model=InferenceClientModel(model_id=model_id), instructions=\"Always talk like a 5 year old.\")\n```\n\n### 4. 추가 계획[[extra-planning]]\n\n일반적인 작업 단계들 중간중간에 에이전트가 추가로 계획을 세우는 단계를 넣을 수 있습니다. 
이때는 도구를 사용하지 않고, LLM이 현재까지 파악한 정보를 정리하고 그 정보를 토대로 앞으로의 계획을 다시 점검하게 됩니다.\n\n```py\nfrom smolagents import load_tool, CodeAgent, InferenceClientModel, WebSearchTool\nfrom dotenv import load_dotenv\n\nload_dotenv()\n\n# Import tool from Hub\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nsearch_tool = WebSearchTool()\n\nagent = CodeAgent(\n    tools=[search_tool, image_generation_tool],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen2.5-72B-Instruct\"),\n    planning_interval=3 # This is where you activate planning!\n)\n\n# Run it!\nresult = agent.run(\n    \"How long would a cheetah at full speed take to run the length of Pont Alexandre III?\",\n)\n```"
  },
  {
    "path": "docs/source/ko/tutorials/inspect_runs.md",
    "content": "# OpenTelemetry로 실행 검사하기[[inspecting-runs-with-opentelemetry]]\n\n[[open-in-colab]]\n\n> [!TIP]\n> 에이전트 구축이 처음이라면 먼저 [에이전트 소개](../conceptual_guides/intro_agents)와 [안내서](../guided_tour)를 읽어보세요.\n\n## 에이전트 실행을 로깅하는 이유는?[[why-log-your-agent-runs?]]\n\n에이전트 실행을 디버깅하는 것은 복잡한 작업입니다.\n\n실행이 제대로 진행되었는지 확인하기 어렵습니다. 에이전트 워크플로우는 설계상 예측 불가능하기 때문입니다(만약 예측 가능했다면 일반적인 코드를 사용했을 것입니다).\n\n실행 과정을 살펴보는 것도 쉽지 않습니다. 다단계 에이전트는 콘솔을 로그로 빠르게 채우는 경향이 있으며, 대부분의 오류는 단순한 \"LLM의 실수\" 유형으로, LLM이 다음 단계에서 더 나은 코드나 도구 호출을 작성하여 스스로 교정합니다.\n\n따라서 나중에 검사하고 모니터링할 수 있도록 계측을 통해 에이전트 실행을 기록하는 것이 프로덕션 환경에서는 필수입니다!\n\n에이전트 실행을 계측하기 위해 [OpenTelemetry](https://opentelemetry.io/) 표준을 도입했습니다.\n\n즉, 계측 코드를 실행한 후 에이전트를 평소처럼 실행하면 모든 내용이 플랫폼에 자동으로 로깅됩니다. 다양한 OpenTelemetry 백엔드에서 이를 구현하는 방법의 예시를 아래에 제시합니다.\n\n플랫폼에서의 실제 모습은 다음과 같습니다.\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.gif\"/>\n</div>\n\n## Arize AI Phoenix로 텔레메트리 설정[[setting-up-telemetry-with-arize-ai-phoenix]]\n\n먼저 필요한 패키지를 설치합니다. 
여기서는 로그를 수집하고 검사하기에 좋은 솔루션인 [Arize AI의 Phoenix](https://github.com/Arize-ai/phoenix)를 설치하지만, 이 과정에는 다른 OpenTelemetry 호환 플랫폼을 활용할 수도 있습니다.\n\n```shell\npip install 'smolagents[telemetry,toolkit]'\n```\n\n다음 단계로 수집기를 백그라운드에서 실행합니다.\n\n```shell\npython -m phoenix.server.main serve\n```\n\n마지막으로 `SmolagentsInstrumentor`를 설정하여 에이전트를 추적하고 Phoenix 기본 엔드포인트로 해당 추적 데이터를 전송합니다.\n\n```python\nfrom phoenix.otel import register\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\n\nregister()\nSmolagentsInstrumentor().instrument()\n```\n이제 에이전트를 실행할 수 있습니다!\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel()\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n)\nmanager_agent.run(\n    \"If the US keeps its 2024 growth rate, how many years will it take for the GDP to double?\"\n)\n```\n끝입니다!\n이제 `http://0.0.0.0:6006/projects/`로 이동하여 실행 결과를 확인할 수 있습니다!\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.png\">\n\nCodeAgent가 관리하는 ToolCallingAgent를 호출하여(참고로 관리되는 에이전트는 CodeAgent가 될 수도 있습니다) 미국 2024년 성장률을 웹에서 검색하도록 요청한 것을 확인할 수 있습니다. 이후 관리되는 에이전트가 결과를 보고하면, 관리자 에이전트가 이 정보를 활용하여 경제 배증 시간을 계산했습니다! 흥미롭죠?\n\n## 🪢 Langfuse로 텔레메트리 설정[[setting-up-telemetry-with-🪢-langfuse]]\n\n이 부분은 `SmolagentsInstrumentor`를 사용하여 **Langfuse**로 Hugging Face **smolagents**를 모니터링하고 디버깅하는 방법을 보여줍니다.\n\n> **Langfuse란?** [Langfuse](https://langfuse.com)는 LLM 엔지니어링을 위한 오픈소스 플랫폼입니다. AI 에이전트를 위한 추적 및 모니터링 기능을 제공하여 개발자가 제품을 디버깅하고, 분석하고, 최적화할 수 있도록 도와줍니다. 
Langfuse는 네이티브 통합, OpenTelemetry, SDK를 통해 다양한 도구와 프레임워크와 통합됩니다.\n\n### 1단계: 의존성 설치[[step-1:-install-dependencies]]\n\n```python\n%pip install langfuse 'smolagents[telemetry]' openinference-instrumentation-smolagents\n```\n\n### 2단계: 환경 변수 설정[[step-2:-set-up-environment-variables]]\n\nLangfuse API 키를 설정하고 Langfuse로 추적을 보내도록 OpenTelemetry 엔드포인트를 구성하세요. [Langfuse Cloud](https://cloud.langfuse.com)에 가입하거나 [Langfuse를 자체 호스팅](https://langfuse.com/self-hosting)하여 Langfuse API 키를 얻으세요.\n\n또한 [Hugging Face 토큰](https://huggingface.co/settings/tokens) (`HF_TOKEN`)을 환경 변수로 추가하세요.\n\n```python\nimport os\n# 프로젝트 설정 페이지(https://cloud.langfuse.com)에서 프로젝트 키를 가져옵니다. \nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-lf-...\" \nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-lf-...\" \nos.environ[\"LANGFUSE_HOST\"] = \"https://cloud.langfuse.com\" # 🇪🇺 유럽 지역\n# os.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\" # 🇺🇸 미국 지역\n \n# Hugging Face 토큰을 입력합니다.\nos.environ[\"HF_TOKEN\"] = \"hf_...\"\n```\n\n환경 변수가 설정되면 이제 Langfuse 클라이언트를 초기화할 수 있습니다. `get_client()`는 환경 변수에 제공된 자격 증명을 사용하여 Langfuse 클라이언트를 초기화합니다.\n\n```python\nfrom langfuse import get_client\n \nlangfuse = get_client()\n \n# 연결을 확인합니다.\nif langfuse.auth_check():\n    print(\"Langfuse client is authenticated and ready!\")\nelse:\n    print(\"Authentication failed. 
Please check your credentials and host.\")\n```\n\n### 3단계: `SmolagentsInstrumentor` 초기화[[step-3:-initialize-the-`smolagentsinstrumentor`]]\n\n애플리케이션 코드를 실행하기 전에 `SmolagentsInstrumentor`를 초기화하세요.\n\n```python\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\n \nSmolagentsInstrumentor().instrument()\n```\n\n### 4단계: smolagent 실행[[step-4:-run-your-smolagent]]\n\n```python\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel(\n    model_id=\"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B\"\n)\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n)\nmanager_agent.run(\n    \"How can Langfuse be used to monitor and improve the reasoning and decision-making of smolagents when they execute multi-step tasks, like dynamically adjusting a recipe based on user feedback or available ingredients?\"\n)\n```\n\n### 5단계: Langfuse에서 추적 보기[[step-5:-view-traces-in-langfuse]]\n\n에이전트를 실행한 후, Langfuse의 smolagents 애플리케이션에서 생성된 추적 정보를 확인할 수 있습니다. AI 에이전트의 디버깅과 최적화에 도움이 되는 LLM 상호작용의 상세한 세부 과정을 살펴볼 수 있습니다.\n\n![smolagents example trace](https://langfuse.com/images/cookbook/integration-smolagents/smolagent_example_trace.png)\n\n_[Langfuse의 추적 예시](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/ce5160f9bfd5a6cd63b07d2bfcec6f54?timestamp=2025-02-11T09%3A25%3A45.163Z&display=details)_\n"
  },
  {
    "path": "docs/source/ko/tutorials/memory.md",
    "content": "# 📚 에이전트 메모리 관리[[-manage-your-agents-memory]]\n\n[[open-in-colab]]\n\n결국 에이전트는 도구와 프롬프트로 이루어진 단순한 구성요소로 정의됩니다.\n그리고 무엇보다 중요한 것은 에이전트가 과거 단계의 메모리를 가지고 있어 계획, 실행, 오류의 이력을 추적한다는 점입니다.\n\n### 에이전트 메모리 재생[[replay-your-agents-memory]]\n\n과거 실행된 에이전트를 확인하기 위한 몇 가지 기능을 제공합니다.\n\n[계측 가이드](./inspect_runs)에서 언급한 바와 같이, 에이전트 실행을 계측하여 특정 단계를 확대하거나 축소할 수 있는 우수한 UI로 시각화할 수 있습니다.\n\n또한 다음과 같이 `agent.replay()`를 사용할 수도 있습니다.\n\n에이전트를 실행한 후,\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=0)\n\nresult = agent.run(\"What's the 20th Fibonacci number?\")\n```\n\n이 마지막 실행을 다시 재생하고 싶다면, 다음 코드를 사용하면 됩니다.\n```py\nagent.replay()\n```\n\n### 에이전트 메모리 동적 변경[[dynamically-change-the-agents-memory]]\n\n많은 고급 사용 사례에서는 에이전트의 메모리를 동적으로 수정해야 합니다.\n\n에이전트의 메모리는 다음과 같이 접근할 수 있습니다.\n\n\n```py\nfrom smolagents import ActionStep\n\nsystem_prompt_step = agent.memory.system_prompt\nprint(\"The system prompt given to the agent was:\")\nprint(system_prompt_step.system_prompt)\n\ntask_step = agent.memory.steps[0]\nprint(\"\\n\\nThe first task step was:\")\nprint(task_step.task)\n\nfor step in agent.memory.steps:\n    if isinstance(step, ActionStep):\n        if step.error is not None:\n            print(f\"\\nStep {step.step_number} got this error:\\n{step.error}\\n\")\n        else:\n            print(f\"\\nStep {step.step_number} got these observations:\\n{step.observations}\\n\")\n```\n\n`agent.memory.get_full_steps()`를 사용하여 전체 단계를 딕셔너리 형태로 가져올 수 있습니다.\n\n또한 단계 콜백을 사용하여 에이전트의 메모리를 동적으로 변경할 수도 있습니다.\n\n단계 콜백은 인자로 `agent` 객체 자체에 접근할 수 있으므로, 위에서 설명한 것처럼 모든 메모리 단계에 접근하여 필요한 경우 수정할 수 있습니다. 예를 들어, 웹 브라우저 에이전트가 수행하는 각 단계의 스크린샷을 관찰하고 있다고 가정해 보겠습니다. 이 경우 최신 스크린샷은 유지하면서 토큰 비용을 절약하기 위해 이전 단계의 이미지를 메모리에서 제거할 수 있습니다.\n\n이 경우 다음과 같은 코드를 사용할 수 있습니다.\n_주의: 이 코드는 간결함을 위해 일부 임포트 및 객체 정의가 생략된 불완전한 예시입니다. 
전체 작동 버전의 코드는 [원본 스크립트](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py)에서 확인하세요._\n\n```py\nimport helium\nfrom PIL import Image\nfrom io import BytesIO\nfrom time import sleep\n\ndef update_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # JavaScript 애니메이션이 완료된 후에 스크린샷을 찍도록 합니다.\n    driver = helium.get_driver()\n    latest_step = memory_step.step_number\n    for previous_memory_step in agent.memory.steps:  # 이전 스크린샷을 로그에서 제거하여 처리 과정을 간소화합니다.\n        if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= latest_step - 2:\n            previous_memory_step.observations_images = None\n    png_bytes = driver.get_screenshot_as_png()\n    image = Image.open(BytesIO(png_bytes))\n    memory_step.observations_images = [image.copy()]\n```\n\n그 다음 에이전트를 초기화할 때 이 함수를 다음과 같이 `step_callbacks` 인수에 전달해야 합니다.\n\n```py\nCodeAgent(\n    tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],\n    model=model,\n    additional_authorized_imports=[\"helium\"],\n    step_callbacks=[update_screenshot],\n    max_steps=20,\n    verbosity_level=2,\n)\n```\n\n전체 작동 예시는 [비전 웹 브라우저 코드](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py)에서 확인할 수 있습니다.\n\n### 에이전트를 단계별로 실행[[run-agents-one-step-at-a-time]]\n\n이 기능은 도구 호출에 오랜 시간이 걸리는 경우에 유용합니다.\n에이전트를 한 단계씩 실행하면서 각 단계에서 메모리를 업데이트할 수 있습니다.\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent, ActionStep, TaskStep\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=1)\nagent.python_executor.send_tools({**agent.tools})\nprint(agent.memory.system_prompt)\n\ntask = \"What is the 20th Fibonacci number?\"\n\n# 필요에 따라 다른 에이전트의 메모리를 불러와 메모리를 수정할 수 있습니다.\n# agent.memory.steps = previous_agent.memory.steps\n\n# 새로운 작업을 시작합니다!\nagent.memory.steps.append(TaskStep(task=task, task_images=[]))\n\nfinal_answer = None\nstep_number = 1\nwhile final_answer is None and 
step_number <= 10:\n    memory_step = ActionStep(\n        step_number=step_number,\n        observations_images=[],\n    )\n    # 한 단계를 실행합니다.\n    final_answer = agent.step(memory_step)\n    agent.memory.steps.append(memory_step)\n    step_number += 1\n\n    # 필요한 경우 메모리를 수정할 수도 있습니다\n    # 예를 들어 최신 단계를 업데이트 하려면 다음과 같이 처리합니다:\n    # agent.memory.steps[-1] = ...\n\nprint(\"The final answer is:\", final_answer)\n```\n"
  },
  {
    "path": "docs/source/zh/_config.py",
    "content": "# docstyle-ignore\nINSTALL_CONTENT = \"\"\"\n# Installation\n! pip install smolagents\n# To install from source instead of the last release, comment the command above and uncomment the following one.\n# ! pip install git+https://github.com/huggingface/smolagents.git\n\"\"\"\n\nnotebook_first_cells = [{\"type\": \"code\", \"content\": INSTALL_CONTENT}]\nblack_avoid_patterns = {\n    \"{processor_class}\": \"FakeProcessorClass\",\n    \"{model_class}\": \"FakeModelClass\",\n    \"{object_class}\": \"FakeObjectClass\",\n}\n"
  },
  {
    "path": "docs/source/zh/_toctree.yml",
    "content": "- title: 起步\n  sections:\n  - local: index\n    title: 🤗 Agents\n  - local: guided_tour\n    title: 导览\n- title: Tutorials\n  sections:\n  - local: tutorials/building_good_agents\n    title: ✨ 构建好用的 agents\n  - local: tutorials/inspect_runs\n    title: 📊 监控 Agent 的运行\n  - local: tutorials/tools\n    title: 🛠️ 工具 - 深度指南\n  - local: tutorials/secure_code_execution\n    title: 🛡️ 使用 E2B 保护你的代码执行\n  - local: tutorials/memory\n    title: 📚 管理 Agent 的记忆\n- title: Conceptual guides\n  sections:\n  - local: conceptual_guides/intro_agents\n    title: 🤖 Agent 化系统介绍\n  - local: conceptual_guides/react\n    title: 🤔 多步骤 Agent 是如何工作的？\n- title: Examples\n  sections:\n  - local: examples/text_to_sql\n    title: 自我修正 Text-to-SQL\n  - local: examples/rag\n    title: 借助 agentic RAG 掌控知识库\n  - local: examples/multiagents\n    title: 编排 multi-agent 系统\n  - local: examples/web_browser\n    title: 基于视觉模型构建能够浏览网页的agent\n- title: Reference\n  sections:\n  - local: reference/agents\n    title: Agent-related objects\n  - local: reference/models\n    title: Model-related objects\n  - local: reference/tools\n    title: Tool-related objects\n"
  },
  {
    "path": "docs/source/zh/conceptual_guides/intro_agents.md",
    "content": "# Agent 简介\n\n> [!TIP]\n> 译者注：Agent 的业内术语是“智能体”。本译文将保留 agent，不作翻译，以带来更高效的阅读体验。(在中文为主的文章中，It's easier to 注意到英文。Attention Is All You Need!)\n\n## 🤔 什么是 agent？\n\n任何使用 AI 的高效系统都需要为 LLM 提供某种访问现实世界的方式：例如调用搜索工具获取外部信息，或者操作某些程序以完成任务。换句话说，LLM 应该具有 **_Agent 能力_**。Agent 程序是 LLM 通往外部世界的门户。\n\n> [!TIP]\n> AI agent 是 **LLM 输出控制工作流的程序**。\n\n任何利用 LLM 的系统都会将 LLM 输出集成到代码中。LLM 输入对代码工作流的影响程度就是 LLM 在系统中的 agent 能力级别。\n\n请注意，根据这个定义，\"Agent\" 不是一个离散的、非 0 即 1 的定义：相反，\"Agent 能力\" 是一个连续谱系，随着你在工作流中给予 LLM 更多或更少的权力而变化。\n\n请参见下表中 agent 能力在不同系统中的变化：\n\n| Agent 能力级别 | 描述                                           | 名称       | 示例模式                                           |\n| ------------ | ---------------------------------------------- | ---------- | -------------------------------------------------- |\n| ☆☆☆          | LLM 输出对程序流程没有影响                     | 简单处理器 | `process_llm_output(llm_response)`                 |\n| ★☆☆          | LLM 输出决定 if/else 分支                      | 路由       | `if llm_decision(): path_a() else: path_b()`       |\n| ★★☆          | LLM 输出决定函数执行                           | 工具调用者 | `run_function(llm_chosen_tool, llm_chosen_args)`   |\n| ★★★          | LLM 输出控制迭代和程序继续                     | 多步 Agent | `while llm_should_continue(): execute_next_step()` |\n| ★★★          | 一个 agent 工作流可以启动另一个 agent 工作流 | 多 Agent   | `if llm_trigger(): execute_agent()`                |\n\n多步 agent 具有以下代码结构：\n\n```python\nmemory = [user_defined_task]\nwhile llm_should_continue(memory): # 这个循环是多步部分\n    action = llm_get_next_action(memory) # 这是工具调用部分\n    observations = execute_action(action)\n    memory += [action, observations]\n```\n\n这个 agent 系统在一个循环中运行，每一步执行一个新动作（该动作可能涉及调用一些预定义的 *工具*，这些工具只是函数），直到其观察结果表明已达到解决给定任务的满意状态。以下是一个多步 agent 如何解决简单数学问题的示例：\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"/>\n</div>\n\n## ✅ 何时使用 agent / ⛔ 何时避免使用\n\n当你需要 LLM 
确定应用程序的工作流时，agent 很有用。但它们通常有些过度。问题是：我真的需要工作流的灵活性来有效解决手头的任务吗？\n如果预定义的工作流经常不足，这意味着你需要更多的灵活性。\n让我们举个例子：假设你正在开发一个处理冲浪旅行网站客户请求的应用程序。\n\n你可以提前知道请求将属于 2 个类别之一（基于用户选择），并且你为这 2 种情况都有预定义的工作流。\n\n1. 想要了解旅行信息？⇒ 给他们访问搜索栏以搜索你的知识库\n2. 想与销售交谈？⇒ 让他们填写联系表单。\n\n如果这个确定性工作流适合所有查询，那就直接编码吧！这将为你提供一个 100% 可靠的系统，没有让不可预测的 LLM 干扰你的工作流而引入错误的风险。为了简单和稳健起见，建议规范化不使用任何 agent 行为。\n\n但如果工作流不能提前确定得那么好呢？\n\n例如，用户想问：`\"I can come on Monday, but I forgot my passport so risk being delayed to Wednesday, is it possible to take me and my stuff to surf on Tuesday morning, with a cancellation insurance?\"` 这个问题涉及许多因素，可能上述预定的标准都不足以满足这个请求。\n\n如果预定义的工作流经常不足，这意味着你需要更多的灵活性。\n\n这就是 agent 设置发挥作用的地方。\n\n在上面的例子中，你可以创建一个多步 agent，它可以访问天气 API 获取天气预报，Google Maps API 计算旅行距离，员工在线仪表板和你的知识库上的 RAG 系统。\n\n直到最近，计算机程序还局限于预定义的工作流，试图通过堆积 if/else 分支来处理复杂性。它们专注于极其狭窄的任务，如\"计算这些数字的总和\"或\"找到这个图中的最短路径\"。但实际上，大多数现实生活中的任务，如我们上面的旅行示例，都不适合预定义的工作流。agent 系统为程序打开了现实世界任务的大门！\n\n## 为什么选择 `smolagents`？\n\n对于一些低级的 agent 用例，如链或路由器，你可以自己编写所有代码。这样会更好，因为它可以让你更好地控制和理解你的系统。\n\n但一旦你开始追求更复杂的行为，比如让 LLM 调用函数（即\"工具调用\"）或让 LLM 运行 while 循环（\"多步 agent\"），一些抽象就变得必要：\n\n- 对于工具调用，你需要解析 agent 的输出，因此这个输出需要一个预定义的格式，如\"Thought: I should call tool 'get_weather'. 
Action: get_weather(Paris).\"，你用预定义的函数解析它，并且给 LLM 的系统提示应该通知它这个格式。\n- 对于 LLM 输出决定循环的多步 agent，你需要根据上次循环迭代中发生的情况给 LLM 不同的提示：所以你需要某种记忆能力。\n\n看到了吗？通过这两个例子，我们已经发现需要一些项目来帮助我们：\n\n- 当然，一个作为系统引擎的 LLM\n- agent 可以访问的工具列表\n- 从 LLM 输出中提取工具调用的解析器\n- 与解析器同步的系统提示\n- 记忆能力\n\n但是等等，既然我们给 LLM 在决策中留出了空间，它们肯定会犯错误：所以我们需要错误日志记录和重试机制。\n\n所有这些元素都需要紧密耦合才能形成一个功能良好的系统。这就是为什么我们决定需要制作基本构建块来让所有这些东西协同工作。\n\n## 代码 agent\n\n在多步 agent 中，每一步 LLM 都可以编写一个动作，形式为调用外部工具。编写这些动作的常见格式（由 Anthropic、OpenAI 等使用）通常是\"将动作编写为工具名称和要使用的参数的 JSON，然后解析以知道要执行哪个工具以及使用哪些参数\"的不同变体。\n\n[多项](https://huggingface.co/papers/2402.01030) [研究](https://huggingface.co/papers/2411.01747) [论文](https://huggingface.co/papers/2401.00812) 表明，在代码中进行工具调用的 LLM 要好得多。\n\n原因很简单，_我们专门设计了我们的代码语言，使其成为表达计算机执行动作的最佳方式_。如果 JSON 片段是更好的表达方式，JSON 将成为顶级编程语言，编程将变得非常困难。\n\n下图取自 [Executable Code Actions Elicit Better LLM Agents](https://huggingface.co/papers/2402.01030)，说明了用代码编写动作的一些优势：\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png\">\n\n与 JSON 片段相比，用代码编写动作提供了更好的：\n\n- **可组合性：** 你能像定义 python 函数一样，将 JSON 动作嵌套在一起，或定义一组 JSON 动作以供重用吗？\n- **对象管理：** 你如何在 JSON 中存储像 `generate_image` 这样的动作的输出？\n- **通用性：** 代码被构建为简单地表达任何你可以让计算机做的事情。\n- **LLM 训练数据中的表示：** 大量高质量的代码动作已经包含在 LLM 的训练数据中，这意味着它们已经为此进行了训练！\n"
  },
  {
    "path": "docs/source/zh/conceptual_guides/react.md",
    "content": "# 多步骤 agent 是如何工作的？\n\nReAct 框架（[Yao et al., 2022](https://huggingface.co/papers/2210.03629)）是目前构建 agent 的主要方法。\n\n该名称基于两个词的组合：\"Reason\" （推理）和 \"Act\" （行动）。实际上，遵循此架构的 agent 将根据需要尽可能多的步骤来解决其任务，每个步骤包括一个推理步骤，然后是一个行动步骤，在该步骤中，它制定工具调用，使其更接近解决手头的任务。\n\nReAct 过程涉及保留过去步骤的记忆。\n\n> [!TIP]\n> 阅读 [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) 博客文章以了解更多关于多步 agent 的信息。\n\n以下是其工作原理的视频概述：\n\n<div class=\"flex justify-center\">\n    <img\n        class=\"block dark:hidden\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n    <img\n        class=\"hidden dark:block\"\n        src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif\"\n    />\n</div>\n\n![ReAct agent 的框架](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/open-source-llms-as-agents/ReAct.png)\n\n我们实现了两个版本的 ToolCallingAgent：\n- [`ToolCallingAgent`] 在其输出中生成 JSON 格式的工具调用。\n- [`CodeAgent`] 是一种新型的 ToolCallingAgent，它生成代码块形式的工具调用，这对于具有强大编码性能的 LLM 非常有效。\n"
  },
  {
    "path": "docs/source/zh/examples/multiagents.md",
    "content": "# 编排 multi-agent 系统 🤖🤝🤖\n\n[[open-in-colab]]\n\n此notebook将构建一个 **multi-agent 网络浏览器：一个有多个代理协作，使用网络进行搜索解决问题的代理系统**\n\n`ManagedAgent` 对象将封装这些管理网络搜索的agent，形成一个简单的层次结构：\n\n```\n              +----------------+\n              | Manager agent  |\n              +----------------+\n                       |\n        _______________|______________\n       |                              |\n  Code interpreter   +--------------------------------+\n       tool          |         Managed agent          |\n                     |      +------------------+      |\n                     |      | Web Search agent |      |\n                     |      +------------------+      |\n                     |         |            |         |\n                     |  Web Search tool     |         |\n                     |             Visit webpage tool |\n                     +--------------------------------+\n```\n我们来一起构建这个系统。运行下列代码以安装依赖包：\n\n```\n!pip install 'smolagents[toolkit]' --upgrade -q\n```\n\n我们需要登录Hugging Face Hub以调用HF的Inference API：\n\n```\nfrom huggingface_hub import login\n\nlogin()\n```\n\n⚡️ HF的Inference API 可以快速轻松地运行任何开源模型，因此我们的agent将使用HF的Inference API\n中的`InferenceClientModel`类来调用\n[Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking)模型。\n\n_Note:_ 基于多参数和部署模型的 Inference API 可能在没有预先通知的情况下更新或替换模型。了解更多信息，请参阅[这里](https://huggingface.co/docs/api-inference/supported-models)。\n\n```py\nmodel_id = \"Qwen/Qwen3-Next-80B-A3B-Thinking\"\n```\n\n## 🔍 创建网络搜索工具\n\n虽然我们可以使用已经存在的\n[`WebSearchTool`]\n工具作为谷歌搜索的平替进行网页浏览，然后我们也需要能够查看`WebSearchTool`找到的页面。为此，我\n们可以直接导入库的内置\n`VisitWebpageTool`。但是我们将重新构建它以了解其工作原理。\n\n我们将使用`markdownify` 来从头构建我们的`VisitWebpageTool`工具。\n\n```py\nimport re\nimport requests\nfrom markdownify import markdownify\nfrom requests.exceptions import RequestException\nfrom smolagents import tool\n\n\n@tool\ndef visit_webpage(url: str) -> str:\n    \"\"\"Visits a webpage at the given URL and returns its content as a markdown 
string.\n\n    Args:\n        url: The URL of the webpage to visit.\n\n    Returns:\n        The content of the webpage converted to Markdown, or an error message if the request fails.\n    \"\"\"\n    try:\n        # Send a GET request to the URL\n        response = requests.get(url)\n        response.raise_for_status()  # Raise an exception for bad status codes\n\n        # Convert the HTML content to Markdown\n        markdown_content = markdownify(response.text).strip()\n\n        # Remove multiple line breaks\n        markdown_content = re.sub(r\"\\n{3,}\", \"\\n\\n\", markdown_content)\n\n        return markdown_content\n\n    except RequestException as e:\n        return f\"Error fetching the webpage: {str(e)}\"\n    except Exception as e:\n        return f\"An unexpected error occurred: {str(e)}\"\n```\n\n现在我们初始化这个工具并测试它！\n\n```py\nprint(visit_webpage(\"https://en.wikipedia.org/wiki/Hugging_Face\")[:500])\n```\n\n## 构建我们的 multi-agent 系统 🤖🤝🤖\n\n现在我们有了所有工具`search`和`visit_webpage`，我们可以使用它们来创建web agent。\n\n我们该选取什么样的配置来构建这个agent呢？\n- 网页浏览是一个单线程任务，不需要并行工具调用，因此JSON工具调用对于这个任务非常有效。因此我们选择`ToolCallingAgent`。\n- 有时候网页搜索需要探索许多页面才能找到正确答案，所以我们更喜欢将 `max_steps` 增加到10。\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    InferenceClientModel,\n    WebSearchTool,\n)\n\nmodel = InferenceClientModel(model_id=model_id)\n\nweb_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), visit_webpage],\n    model=model,\n    max_steps=10,\n    name=\"search\",\n    description=\"Runs web searches for you. 
Give it your query as an argument.\",\n)\n```\n\n请注意，我们为这个代理赋予了 name（名称）和 description（描述）属性，这些是必需属性，以便让管理代理能够调用此代理。\n\n然后，我们创建一个管理代理，在初始化时，将受管代理作为 managed_agents 参数传递给它。\n\n由于这个代理的任务是进行规划和思考，高级推理能力会很有帮助，因此 CodeAgent（代码代理）将是最佳选择。\n\n此外，我们要提出一个涉及当前年份并需要进行额外数据计算的问题：所以让我们添加 additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"]，以防代理需要用到这些包。\n\n```py\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[web_agent],\n    additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"],\n)\n```\n\n可以了！现在让我们运行我们的系统！我们选择一个需要一些计算和研究的问题：\n\n```py\nanswer = manager_agent.run(\"If LLM training continues to scale up at the current rhythm until 2030, what would be the electric power in GW required to power the biggest training runs by 2030? What would that correspond to, compared to some countries? Please provide a source for any numbers used.\")\n```\n\n我们得到了如下报告作为答案：\n```\nBased on current growth projections and energy consumption estimates, if LLM trainings continue to scale up at the\ncurrent rhythm until 2030:\n\n1. The electric power required to power the biggest training runs by 2030 would be approximately 303.74 GW, which\ntranslates to about 2,660,762 GWh/year.\n\n2. Comparing this to countries' electricity consumption:\n   - It would be equivalent to about 34% of China's total electricity consumption.\n   - It would exceed the total electricity consumption of India (184%), Russia (267%), and Japan (291%).\n   - It would be nearly 9 times the electricity consumption of countries like Italy or Mexico.\n\n3. Source of numbers:\n   - The initial estimate of 5 GW for future LLM training comes from AWS CEO Matt Garman.\n   - The growth projection used a CAGR of 79.80% from market research by Springs.\n   - Country electricity consumption data is from the U.S. 
Energy Information Administration, primarily for the year\n2021.\n```\n\n如果 [scaling hypothesis](https://gwern.net/scaling-hypothesis) 持续成立的话，看来我们需要一些规模可观的发电厂。我们的 agent 成功地协作解决了这个任务！✅\n\n💡 你可以轻松地将这个编排扩展到更多的 agent：一个执行代码，一个进行网页搜索，一个处理文件加载……\n"
  },
  {
    "path": "docs/source/zh/examples/rag.md",
    "content": "# Agentic RAG\n\n[[open-in-colab]]\n\nRetrieval-Augmented-Generation (RAG) 是“使用大语言模型（LLM）来回答用户查询，但基于从知识库中检索的信息”。它比使用普通或微调的 LLM 具有许多优势：举几个例子，它允许将答案基于真实事实并减少虚构；它允许提供 LLM 领域特定的知识；并允许对知识库中的信息访问进行精细控制。\n\n但是，普通的 RAG 存在一些局限性，以下两点尤为突出：\n\n- 它只执行一次检索步骤：如果结果不好，生成的内容也会不好。\n- 语义相似性是以用户查询为参考计算的，这可能不是最优的：例如，用户查询通常是一个问题，而包含真实答案的文档通常是肯定语态，因此其相似性得分会比其他以疑问形式呈现的源文档低，从而导致错失相关信息的风险。\n\n我们可以通过制作一个 RAG  agent来缓解这些问题：非常简单，一个配备了检索工具的agent！这个 agent 将\n会：✅ 自己构建查询和检索，✅ 如果需要的话会重新检索。\n\n因此，它将比普通 RAG 更智能，因为它可以自己构建查询，而不是直接使用用户查询作为参考。这样，它可以更\n接近目标文档，从而提高检索的准确性， [HyDE](https://huggingface.co/papers/2212.10496)。此 agent 可以\n使用生成的片段，并在需要时重新检索，就像 [Self-Query](https://docs.llamaindex.ai/en/stable/examples/evaluation/RetryQuery/)。\n\n我们现在开始构建这个系统. 🛠️\n\n运行以下代码以安装所需的依赖包：\n```bash\n!pip install smolagents pandas langchain langchain-community sentence-transformers rank_bm25 --upgrade -q\n```\n\n你需要一个有效的 token 作为环境变量 `HF_TOKEN` 来调用 Inference Providers。我们使用 python-dotenv 来加载它。\n```py\nfrom dotenv import load_dotenv\nload_dotenv()\n```\n\n我们首先加载一个知识库以在其上执行 RAG：此数据集是许多 Hugging Face 库的文档页面的汇编，存储为 markdown 格式。我们将仅保留 `transformers` 库的文档。然后通过处理数据集并将其存储到向量数据库中，为检索器准备知识库。我们将使用 [LangChain](https://python.langchain.com/docs/introduction/) 来利用其出色的向量数据库工具。\n```py\nimport datasets\nfrom langchain.docstore.document import Document\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain_community.retrievers import BM25Retriever\n\nknowledge_base = datasets.load_dataset(\"m-ric/huggingface_doc\", split=\"train\")\nknowledge_base = knowledge_base.filter(lambda row: row[\"source\"].startswith(\"huggingface/transformers\"))\n\nsource_docs = [\n    Document(page_content=doc[\"text\"], metadata={\"source\": doc[\"source\"].split(\"/\")[1]})\n    for doc in knowledge_base\n]\n\ntext_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=500,\n    chunk_overlap=50,\n    add_start_index=True,\n    strip_whitespace=True,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", 
\"\"],\n)\ndocs_processed = text_splitter.split_documents(source_docs)\n```\n\n现在文档已准备好。我们来一起构建我们的 agent RAG 系统！\n👉 我们只需要一个 RetrieverTool，我们的 agent 可以利用它从知识库中检索信息。\n\n由于我们需要将 vectordb 添加为工具的属性，我们不能简单地使用带有 `@tool` 装饰器的简单工具构造函数：因此我们将遵循 [tools 教程](../tutorials/tools) 中突出显示的高级设置。\n\n```py\nfrom smolagents import Tool\n\nclass RetrieverTool(Tool):\n    name = \"retriever\"\n    description = \"Uses semantic search to retrieve the parts of transformers documentation that could be most relevant to answer your query.\"\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, docs, **kwargs):\n        super().__init__(**kwargs)\n        self.retriever = BM25Retriever.from_documents(\n            docs, k=10\n        )\n\n    def forward(self, query: str) -> str:\n        assert isinstance(query, str), \"Your search query must be a string\"\n\n        docs = self.retriever.invoke(\n            query,\n        )\n        return \"\\nRetrieved documents:\\n\" + \"\".join(\n            [\n                f\"\\n\\n===== Document {str(i)} =====\\n\" + doc.page_content\n                for i, doc in enumerate(docs)\n            ]\n        )\n\nretriever_tool = RetrieverTool(docs_processed)\n```\nBM25 检索方法是一个经典的检索方法，因为它的设置速度非常快。为了提高检索准确性，你可以使用语义搜索，使用文档的向量表示替换 BM25：因此你可以前往 [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) 选择一个好的嵌入模型。\n\n现在我们已经创建了一个可以从知识库中检索信息的工具，现在我们可以很容易地创建一个利用这个\n`retriever_tool` 的 agent！此 agent 将使用如下参数初始化：\n- `tools`：代理将能够调用的工具列表。\n- `model`：为代理提供动力的 LLM。\n\n我们的 `model` 必须是一个可调用对象，它接受一个消息的 list 作为输入，并返回文本。它还需要接受一个 stop_sequences 参数，指示何时停止生成。为了方便起见，我们直接使用包中提供的 `HfEngine` 类来获取调用 Hugging Face 的 Inference API 的 LLM 引擎。\n\n接着，我们将使用 [meta-llama/Llama-3.3-70B-Instruct](meta-llama/Llama-3.3-70B-Instruct) 
作为 llm 引\n擎，因为：\n- 它拥有 128k 的长上下文窗口，这对处理长源文档很有用。\n- 它在 HF 的 Inference API 上始终免费提供！\n\n_Note:_ 此 Inference API 托管基于各种标准的模型，部署的模型可能会在没有事先通知的情况下进行更新或替换。了解更多信息，请点击[这里](https://huggingface.co/docs/api-inference/supported-models)。\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nagent = CodeAgent(\n    tools=[retriever_tool], model=InferenceClientModel(model_id=\"meta-llama/Llama-3.3-70B-Instruct\"), max_steps=4, verbosity_level=2\n)\n```\n\n当我们初始化 CodeAgent 时，它已经自动获得了一个默认的系统提示，告诉 LLM 引擎按步骤处理并以代码片段的形式生成工具调用，但你可以根据需要替换此提示模板。接着，当其 `.run()` 方法被调用时，代理将负责调用 LLM 引擎，并在循环中执行工具调用，直到工具 `final_answer` 被调用，其参数即为最终答案。\n\n```py\nagent_output = agent.run(\"For a transformers model training, which is slower, the forward or the backward pass?\")\n\nprint(\"Final output:\")\nprint(agent_output)\n```\n"
  },
  {
    "path": "docs/source/zh/examples/text_to_sql.md",
    "content": "# Text-to-SQL\n\n[[open-in-colab]]\n\n在此教程中，我们将看到如何使用 `smolagents` 实现一个利用 SQL 的 agent。\n\n> 让我们从经典问题开始：为什么不简单地使用标准的 text-to-SQL pipeline 呢？\n\n标准的 text-to-SQL pipeline 很脆弱，因为生成的 SQL 查询可能会出错。更糟糕的是，查询可能出错却不引发错误警报，从而返回一些不正确或无用的结果。\n\n👉 相反，agent 系统则可以检视输出结果并决定查询是否需要被更改，因此带来巨大的性能提升。\n\n让我们来一起构建这个 agent! 💪\n\n首先，我们构建一个 SQL 的环境：\n```py\nfrom sqlalchemy import (\n    create_engine,\n    MetaData,\n    Table,\n    Column,\n    String,\n    Integer,\n    Float,\n    insert,\n    inspect,\n    text,\n)\n\nengine = create_engine(\"sqlite:///:memory:\")\nmetadata_obj = MetaData()\n\n# create city SQL table\ntable_name = \"receipts\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"customer_name\", String(16), primary_key=True),\n    Column(\"price\", Float),\n    Column(\"tip\", Float),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"customer_name\": \"Alan Payne\", \"price\": 12.06, \"tip\": 1.20},\n    {\"receipt_id\": 2, \"customer_name\": \"Alex Mason\", \"price\": 23.86, \"tip\": 0.24},\n    {\"receipt_id\": 3, \"customer_name\": \"Woodrow Wilson\", \"price\": 53.43, \"tip\": 5.43},\n    {\"receipt_id\": 4, \"customer_name\": \"Margaret James\", \"price\": 21.11, \"tip\": 1.00},\n]\nfor row in rows:\n    stmt = insert(receipts).values(**row)\n    with engine.begin() as connection:\n        cursor = connection.execute(stmt)\n```\n\n### 构建 agent\n\n现在，我们构建一个 agent，它将使用 SQL 查询来回答问题。工具的 description 属性将被 agent 系统嵌入到 LLM 的提示中：它为 LLM 提供有关如何使用该工具的信息。这正是我们描述 SQL 表的地方。\n\n```py\ninspector = inspect(engine)\ncolumns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(\"receipts\")]\n\ntable_description = \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\nprint(table_description)\n```\n\n```text\nColumns:\n  - receipt_id: INTEGER\n  - customer_name: VARCHAR(16)\n  - price: FLOAT\n  - tip: 
FLOAT\n```\n\n现在让我们构建我们的工具。它需要以下内容：（更多细节请参阅[工具文档](../tutorials/tools)）\n\n- 一个带有 `Args:` 部分列出参数的 docstring。\n- 输入和输出的type hints。\n\n```py\nfrom smolagents import tool\n\n@tool\ndef sql_engine(query: str) -> str:\n    \"\"\"\n    Allows you to perform SQL queries on the table. Returns a string representation of the result.\n    The table is named 'receipts'. Its description is as follows:\n        Columns:\n        - receipt_id: INTEGER\n        - customer_name: VARCHAR(16)\n        - price: FLOAT\n        - tip: FLOAT\n\n    Args:\n        query: The query to perform. This should be correct SQL.\n    \"\"\"\n    output = \"\"\n    with engine.connect() as con:\n        rows = con.execute(text(query))\n        for row in rows:\n            output += \"\\n\" + str(row)\n    return output\n```\n\n我们现在使用这个工具来创建一个 agent。我们使用 `CodeAgent`，这是 smolagent 的主要 agent 类：一个在代码中编写操作并根据 ReAct 框架迭代先前输出的 agent。\n\n这个模型是驱动 agent 系统的 LLM。`InferenceClientModel` 允许你使用 HF  Inference API 调用 LLM，无论是通过 Serverless 还是 Dedicated endpoint，但你也可以使用任何专有 API。\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"meta-llama/Meta-Llama-3.1-8B-Instruct\"),\n)\nagent.run(\"Can you give me the name of the client who got the most expensive receipt?\")\n```\n\n### Level 2: 表连接\n\n现在让我们增加一些挑战！我们希望我们的 agent 能够处理跨多个表的连接。因此，我们创建一个新表，记录每个 receipt_id 的服务员名字！\n\n```py\ntable_name = \"waiters\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"waiter_name\", String(16), primary_key=True),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"waiter_name\": \"Corey Johnson\"},\n    {\"receipt_id\": 2, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 3, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 4, \"waiter_name\": \"Margaret James\"},\n]\nfor row in rows:\n    stmt = 
insert(receipts).values(**row)\n    with engine.begin() as connection:\n        cursor = connection.execute(stmt)\n```\n\n因为我们新增了一张表，我们需要更新 `sql_engine` 工具的描述，让 LLM 能够正确利用这些表的信息。\n\n```py\nupdated_description = \"\"\"Allows you to perform SQL queries on the table. Beware that this tool's output is a string representation of the execution output.\nIt can use the following tables:\"\"\"\n\ninspector = inspect(engine)\nfor table in [\"receipts\", \"waiters\"]:\n    columns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(table)]\n\n    table_description = f\"Table '{table}':\\n\"\n\n    table_description += \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\n    updated_description += \"\\n\\n\" + table_description\n\nprint(updated_description)\n```\n\n因为这个请求比之前的要难一些，我们将 LLM 引擎切换到更强大的 [Qwen/Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking)！\n\n```py\nsql_engine.description = updated_description\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n)\n\nagent.run(\"Which waiter got more total money from tips?\")\n```\n\n它直接就能工作！设置过程非常简单，难道不是吗？\n\n这个例子到此结束！我们涵盖了这些概念：\n\n- 构建新工具。\n- 更新工具的描述。\n- 切换到更强大的 LLM 有助于 agent 推理。\n\n✅ 现在你可以构建你一直梦寐以求的 text-to-SQL 系统了！✨\n"
  },
  {
    "path": "docs/source/zh/examples/web_browser.md",
    "content": "# 使用Agent实现网页浏览器自动化 🤖🌐\n\n[[open-in-colab]]\n\n在本notebook中，我们将创建一个**基于Agent的网页浏览器自动化系统**！该系统可以自动导航网站、与网页元素交互并提取信息。\n\n该Agent将能够：\n\n- [x] 导航到网页\n- [x] 点击元素\n- [x] 在页面内搜索\n- [x] 处理弹出窗口和模态框\n- [x] 提取信息\n\n让我们一步步搭建这个系统！\n\n首先运行以下命令安装所需依赖：\n\n```bash\npip install smolagents selenium helium pillow -q\n```\n\n让我们导入所需的库并设置环境变量：\n\n```python\nfrom io import BytesIO\nfrom time import sleep\n\nimport helium\nfrom dotenv import load_dotenv\nfrom PIL import Image\nfrom selenium import webdriver\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\n\nfrom smolagents import CodeAgent, tool\nfrom smolagents.agents import ActionStep\n\n# Load environment variables\nload_dotenv()\n```\n\n现在我们来创建核心的浏览器交互工具，使我们的Agent能够导航并与网页交互：\n\n```python\n@tool\ndef search_item_ctrl_f(text: str, nth_result: int = 1) -> str:\n    \"\"\"\n    Searches for text on the current page via Ctrl + F and jumps to the nth occurrence.\n    Args:\n        text: The text to search for\n        nth_result: Which occurrence to jump to (default: 1)\n    \"\"\"\n    elements = driver.find_elements(By.XPATH, f\"//*[contains(text(), '{text}')]\")\n    if nth_result > len(elements):\n        raise Exception(f\"Match n°{nth_result} not found (only {len(elements)} matches found)\")\n    result = f\"Found {len(elements)} matches for '{text}'.\"\n    elem = elements[nth_result - 1]\n    driver.execute_script(\"arguments[0].scrollIntoView(true);\", elem)\n    result += f\"Focused on element {nth_result} of {len(elements)}\"\n    return result\n\n@tool\ndef go_back() -> None:\n    \"\"\"Goes back to previous page.\"\"\"\n    driver.back()\n\n@tool\ndef close_popups() -> str:\n    \"\"\"\n    Closes any visible modal or pop-up on the page. 
Use this to dismiss pop-up windows!\n    This does not work on cookie consent banners.\n    \"\"\"\n    webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()\n```\n\n让我们配置使用Chrome浏览器并设置截图功能：\n\n```python\n# Configure Chrome options\nchrome_options = webdriver.ChromeOptions()\nchrome_options.add_argument(\"--force-device-scale-factor=1\")\nchrome_options.add_argument(\"--window-size=1000,1350\")\nchrome_options.add_argument(\"--disable-pdf-viewer\")\nchrome_options.add_argument(\"--window-position=0,0\")\n\n# Initialize the browser\ndriver = helium.start_chrome(headless=False, options=chrome_options)\n\n# Set up screenshot callback\ndef save_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # Let JavaScript animations happen before taking the screenshot\n    driver = helium.get_driver()\n    current_step = memory_step.step_number\n    if driver is not None:\n        for previous_memory_step in agent.memory.steps:  # Remove previous screenshots for lean processing\n            if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= current_step - 2:\n                previous_memory_step.observations_images = None\n        png_bytes = driver.get_screenshot_as_png()\n        image = Image.open(BytesIO(png_bytes))\n        print(f\"Captured a browser screenshot: {image.size} pixels\")\n        memory_step.observations_images = [image.copy()]  # Create a copy to ensure it persists\n\n    # Update observations with current URL\n    url_info = f\"Current url: {driver.current_url}\"\n    memory_step.observations = (\n        url_info if memory_step.observations is None else memory_step.observations + \"\\n\" + url_info\n    )\n```\n\n现在我们来创建网页自动化Agent：\n\n```python\nfrom smolagents import InferenceClientModel\n\n# Initialize the model\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"  # You can change this to your preferred model\nmodel = InferenceClientModel(model_id=model_id)\n\n# Create the 
agent\nagent = CodeAgent(\n    tools=[go_back, close_popups, search_item_ctrl_f],\n    model=model,\n    additional_authorized_imports=[\"helium\"],\n    step_callbacks=[save_screenshot],\n    max_steps=20,\n    verbosity_level=2,\n)\n\n# Import helium for the agent\nagent.python_executor(\"from helium import *\", agent.state)\n```\n\nAgent需要获得关于如何使用Helium进行网页自动化的指导。以下是我们将提供的操作说明：\n\n```python\nhelium_instructions = \"\"\"\nYou can use helium to access websites. Don't bother about the helium driver, it's already managed.\nWe've already ran \"from helium import *\"\nThen you can go to pages!\nCode:\n```py\ngo_to('github.com/trending')\n```<end_code>\n\nYou can directly click clickable elements by inputting the text that appears on them.\nCode:\n```py\nclick(\"Top products\")\n```<end_code>\n\nIf it's a link:\nCode:\n```py\nclick(Link(\"Top products\"))\n```<end_code>\n\nIf you try to interact with an element and it's not found, you'll get a LookupError.\nIn general stop your action after each button click to see what happens on your screenshot.\nNever try to login in a page.\n\nTo scroll up or down, use scroll_down or scroll_up with as an argument the number of pixels to scroll from.\nCode:\n```py\nscroll_down(num_pixels=1200) # This will scroll one viewport down\n```<end_code>\n\nWhen you have pop-ups with a cross icon to close, don't try to click the close icon by finding its element or targeting an 'X' element (this most often fails).\nJust use your built-in tool `close_popups` to close them:\nCode:\n```py\nclose_popups()\n```<end_code>\n\nYou can use .exists() to check for the existence of an element. 
For example:\nCode:\n```py\nif Text('Accept cookies?').exists():\n    click('I accept')\n```<end_code>\n\"\"\"\n```\n\n现在我们可以运行Agent执行任务了！让我们尝试在维基百科上查找信息：\n\n```python\nsearch_request = \"\"\"\nPlease navigate to https://en.wikipedia.org/wiki/Chicago and give me a sentence containing the word \"1992\" that mentions a construction accident.\n\"\"\"\n\nagent_output = agent.run(search_request + helium_instructions)\nprint(\"Final output:\")\nprint(agent_output)\n```\n\n您可以通过修改请求参数执行不同任务。例如，以下请求可帮助我判断是否需要更加努力工作：\n\n```python\ngithub_request = \"\"\"\nI'm trying to find how hard I have to work to get a repo in github.com/trending.\nCan you navigate to the profile for the top author of the top trending repo, and give me their total number of commits over the last year?\n\"\"\"\n\nagent_output = agent.run(github_request + helium_instructions)\nprint(\"Final output:\")\nprint(agent_output)\n```\n\n该系统在以下任务中尤为有效：\n\n- 从网站提取数据\n- 网页研究自动化\n- 用户界面测试与验证\n- 内容监控"
  },
  {
    "path": "docs/source/zh/guided_tour.md",
    "content": "# Agents - 导览\n\n[[open-in-colab]]\n\n在本导览中，您将学习如何构建一个 agent（智能体），如何运行它，以及如何自定义它以使其更好地适应您的使用场景。\n\n> [!TIP]\n> 译者注：Agent 的业内术语是“智能体”。本译文将保留 agent，不作翻译，以带来更高效的阅读体验。(在中文为主的文章中，It's easier to 注意到英文。Attention Is All You Need!)\n\n> [!TIP]\n> 中文社区发布了关于 smolagents 的介绍和实践讲解视频(来源：[Issue#80](https://github.com/huggingface/smolagents/issues/80))，你可以访问[这里](https://www.youtube.com/watch?v=wwN3oAugc4c)进行观看！\n\n### 构建您的 agent\n\n要初始化一个最小化的 agent，您至少需要以下两个参数：\n\n- `model`，一个为您的 agent 提供动力的文本生成模型 - 因为 agent 与简单的 LLM 不同，它是一个使用 LLM 作为引擎的系统。您可以使用以下任一选项：\n    - [`TransformersModel`] 使用预初始化的 `transformers` 管道在本地机器上运行推理\n    - [`InferenceClientModel`] 在底层使用 `huggingface_hub.InferenceClient`\n    - [`LiteLLMModel`] 让您通过 [LiteLLM](https://docs.litellm.ai/) 调用 100+ 不同的模型！\n    - [`AzureOpenAIModel`] 允许您使用部署在 [Azure](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 中的 OpenAI 模型。\n    - [`MLXModel`] 可创建 [mlx-lm](https://pypi.org/project/mlx-lm/) 流水线，以便在本地机器上运行推理。\n\n- `tools`，agent 可以用来解决任务的 `Tools` 列表。它可以是一个空列表。您还可以通过定义可选参数 `add_base_tools=True` 在您的 `tools` 列表之上添加默认工具箱。\n\n一旦有了这两个参数 `tools` 和 `model`，您就可以创建一个 agent 并运行它。您可以使用任何您喜欢的 LLM，无论是通过 [Hugging Face API](https://huggingface.co/docs/api-inference/en/index)、[transformers](https://github.com/huggingface/transformers/)、[ollama](https://ollama.com/)、[LiteLLM](https://www.litellm.ai/)、[Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service)，还是[mlx-lm](https://pypi.org/project/mlx-lm/).。\n\n<hfoptions id=\"选择一个LLM\">\n<hfoption id=\"Hugging Face API\">\n\nHugging Face API 可以免费使用而无需 token，但会有速率限制。\n\n要访问受限模型或使用 PRO 账户提高速率限制，您需要设置环境变量 `HF_TOKEN` 或在初始化 `InferenceClientModel` 时传递 `token` 变量。\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nmodel = InferenceClientModel(model_id=model_id, token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n 
   \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"本地Transformers模型\">\n\n```python\n# !pip install 'smolagents[transformers]'\nfrom smolagents import CodeAgent, TransformersModel\n\nmodel_id = \"meta-llama/Llama-3.2-3B-Instruct\"\n\nmodel = TransformersModel(model_id=model_id)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"OpenAI或Anthropic API\">\n\n要使用 `LiteLLMModel`，您需要设置环境变量 `ANTHROPIC_API_KEY` 或 `OPENAI_API_KEY`，或者在初始化时传递 `api_key` 变量。\n\n```python\n# !pip install 'smolagents[litellm]'\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", api_key=\"YOUR_ANTHROPIC_API_KEY\") # 也可以使用 'gpt-4o'\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Ollama\">\n\n```python\n# !pip install 'smolagents[litellm]'\nfrom smolagents import CodeAgent, LiteLLMModel\n\nmodel = LiteLLMModel(\n    model_id=\"ollama_chat/llama3.2\", # 这个模型对于 agent 行为来说有点弱\n    api_base=\"http://localhost:11434\", # 如果需要可以替换为远程 open-ai 兼容服务器\n    api_key=\"YOUR_API_KEY\", # 如果需要可以替换为 API key\n    num_ctx=8192, # https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator\n)\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n</hfoption>\n<hfoption id=\"Azure OpenAI\">\n\n要连接到 Azure OpenAI，您可以直接使用 `AzureOpenAIModel`，或使用 `LiteLLMModel` 并进行相应配置。\n\n初始化 `AzureOpenAIModel` 实例时，需要传递模型部署名称，可选择以下任一种方式：1.传递 `azure_endpoint`、`api_key` 和 `api_version` 参数；2.设置环境变量 `AZURE_OPENAI_ENDPOINT`、`AZURE_OPENAI_API_KEY` 和 `OPENAI_API_VERSION`\n\n```python\n# !pip install 'smolagents[openai]'\nfrom smolagents import 
CodeAgent, AzureOpenAIModel\n\nmodel = AzureOpenAIModel(model_id=\"gpt-4o-mini\")\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\n也可按如下方式配置 `LiteLLMModel` 连接 Azure OpenAI：\n\n- 将模型部署名称作为 `model_id` 参数传递，并确保其前缀为 `azure/`\n- 确保设置环境变量 `AZURE_API_VERSION`\n- 任选其一：1.传递 `api_base` 和 `api_key` 参数；2.设置环境变量 `AZURE_API_KEY` 和 `AZURE_API_BASE`\n\n```python\nimport os\nfrom smolagents import CodeAgent, LiteLLMModel\n\nAZURE_OPENAI_CHAT_DEPLOYMENT_NAME=\"gpt-35-turbo-16k-deployment\" # example of deployment name\n\nos.environ[\"AZURE_API_KEY\"] = \"\" # api_key\nos.environ[\"AZURE_API_BASE\"] = \"\" # \"https://example-endpoint.openai.azure.com\"\nos.environ[\"AZURE_API_VERSION\"] = \"\" # \"2024-10-01-preview\"\n\nmodel = LiteLLMModel(model_id=\"azure/\" + AZURE_OPENAI_CHAT_DEPLOYMENT_NAME)\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\n\nagent.run(\n    \"Could you give me the 118th number in the Fibonacci sequence?\",\n)\n```\n\n</hfoption>\n<hfoption id=\"mlx-lm\">\n\n```python\n# !pip install 'smolagents[mlx-lm]'\nfrom smolagents import CodeAgent, MLXModel\n\nmlx_model = MLXModel(\"mlx-community/Qwen2.5-Coder-32B-Instruct-4bit\")\nagent = CodeAgent(model=mlx_model, tools=[], add_base_tools=True)\n\nagent.run(\"Could you give me the 118th number in the Fibonacci sequence?\")\n```\n\n</hfoption>\n</hfoptions>\n\n#### CodeAgent 和 ToolCallingAgent\n\n[`CodeAgent`] 是我们的默认 agent。它将在每一步编写并执行 Python 代码片段。\n\n默认情况下，执行是在您的本地环境中完成的。\n这应该是安全的，因为唯一可以调用的函数是您提供的工具（特别是如果只有 Hugging Face 的工具）和一组预定义的安全函数，如 `print` 或 `math` 模块中的函数，所以您已经限制了可以执行的内容。\n\nPython 解释器默认也不允许在安全列表之外导入，所以所有最明显的攻击都不应该成为问题。\n您可以通过在初始化 [`CodeAgent`] 时将授权模块作为字符串列表传递给参数 `additional_authorized_imports` 来授权额外的导入：\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel = InferenceClientModel()\nagent = CodeAgent(tools=[], model=model, additional_authorized_imports=['requests', 'bs4'])\nagent.run(\"Could 
you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n> [!WARNING]\n> LLM 可以生成任意代码然后执行：不要添加任何不安全的导入！\n\n如果生成的代码尝试执行非法操作或出现常规 Python 错误，执行将停止。\n\n您也可以使用 [E2B 代码执行器](https://e2b.dev/docs#what-is-e2-b) 或 Docker 而不是本地 Python 解释器。对于 E2B，首先 [设置 `E2B_API_KEY` 环境变量](https://e2b.dev/dashboard?tab=keys)，然后在初始化 agent 时传递 `executor_type=\"e2b\"`。对于 Docker，在初始化时传递 `executor_type=\"docker\"`。\n\n> [!TIP]\n> 在 [该教程中](tutorials/secure_code_execution) 了解更多关于代码执行的内容。\n\n我们还支持广泛使用的将动作编写为 JSON-like 块的方式：[`ToolCallingAgent`]，它的工作方式与 [`CodeAgent`] 非常相似，当然没有 `additional_authorized_imports`，因为它不执行代码：\n\n```py\nfrom smolagents import ToolCallingAgent, WebSearchTool\n\nagent = ToolCallingAgent(tools=[WebSearchTool()], model=model)\nagent.run(\"Could you get me the title of the page at url 'https://huggingface.co/blog'?\")\n```\n\n### 检查 agent 运行\n\n以下是一些有用的属性，用于检查运行后发生了什么：\n- `agent.logs` 存储 agent 的细粒度日志。在 agent 运行的每一步，所有内容都会存储在一个字典中，然后附加到 `agent.logs` 中。\n- 运行 `agent.write_memory_to_messages()` 会为 LLM 创建一个 agent 日志的内部内存，作为聊天消息列表。此方法会遍历日志的每一步，并仅存储它感兴趣的内容作为消息：例如，它会将系统提示和任务存储为单独的消息，然后对于每一步，它会将 LLM 输出存储为一条消息，工具调用输出存储为另一条消息。如果您想要更高级别的视图 - 但不是每个日志都会被此方法转录。\n\n## 工具\n\n工具是 agent 使用的原子函数。为了被 LLM 使用，它还需要一些构成其 API 的属性，这些属性将用于向 LLM 描述如何调用此工具：\n- 名称\n- 描述\n- 输入类型和描述\n- 输出类型\n\n例如，您可以查看 [`PythonInterpreterTool`]：它有一个名称、描述、输入描述、输出类型和一个执行操作的 `forward` 方法。\n\n当 agent 初始化时，工具属性用于生成工具描述，该描述被嵌入到 agent 的系统提示中。这让 agent 知道它可以使用哪些工具以及为什么。\n\n### 默认工具箱\n\n`smolagents` 附带了一个用于增强 agent 的默认工具箱，您可以在初始化时通过参数 `add_base_tools=True` 将其添加到您的 agent 中：\n\n- **DuckDuckGo 网页搜索**：使用 DuckDuckGo 浏览器执行网页搜索。\n- **Python 代码解释器**：在安全环境中运行 LLM 生成的 Python 代码。只有在使用 `add_base_tools=True` 初始化 [`ToolCallingAgent`] 时才会添加此工具，因为基于代码的 agent 已经可以原生执行 Python 代码\n- **转录器**：基于 Whisper-Turbo 构建的语音转文本管道，将音频转录为文本。\n\n您可以通过调用 [`load_tool`] 函数和要执行的任务手动使用工具。\n\n```python\nfrom smolagents import WebSearchTool\n\nsearch_tool = WebSearchTool()\nprint(search_tool(\"Who's the current president of Russia?\"))\n```\n\n### 
创建一个新工具\n\n您可以创建自己的工具，用于 Hugging Face 默认工具未涵盖的用例。\n例如，让我们创建一个工具，返回 Hub 上给定任务下载量最多的模型。\n\n您将从以下代码开始。\n\n```python\nfrom huggingface_hub import list_models\n\ntask = \"text-classification\"\n\nmost_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\nprint(most_downloaded_model.id)\n```\n\n这段代码可以通过将其包装在一个函数中并添加 `tool` 装饰器快速转换为工具：\n这不是构建工具的唯一方法：您可以直接将其定义为 [`Tool`] 的子类，这为您提供了更多的灵活性，例如初始化重型类属性的可能性。\n\n让我们看看这两种选项的工作原理：\n\n<hfoptions id=\"构建工具\">\n<hfoption id=\"使用@tool装饰一个函数\">\n\n```py\nfrom smolagents import tool\n\n@tool\ndef model_download_tool(task: str) -> str:\n    \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\n\n    Args:\n        task: The task for which to get the download count.\n    \"\"\"\n    most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n    return most_downloaded_model.id\n```\n\n该函数需要：\n- 一个清晰的名称。名称应该足够描述此工具的功能，以帮助为 agent 提供动力的 LLM。由于此工具返回任务下载量最多的模型，我们将其命名为 `model_download_tool`。\n- 输入和输出的类型提示\n- 一个描述，其中包括一个 'Args:' 部分，其中每个参数都被描述（这次没有类型指示，它将从类型提示中提取）。与工具名称一样，此描述是为您的 agent 提供动力的 LLM 的说明书，所以不要忽视它。\n所有这些元素将在初始化时自动嵌入到 agent 的系统提示中：因此要努力使它们尽可能清晰！\n\n> [!TIP]\n> 此定义格式与 `apply_chat_template` 中使用的工具模式相同，唯一的区别是添加了 `tool` 装饰器：[这里](https://huggingface.co/blog/unified-tool-use#passing-tools-to-a-chat-template) 了解更多关于我们的工具使用 API。\n\n\n然后您可以直接初始化您的 agent：\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[model_download_tool], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n</hfoption>\n<hfoption id=\"子类化Tool\">\n\n```py\nfrom smolagents import Tool\n\nclass ModelDownloadTool(Tool):\n    name = \"model_download_tool\"\n    description = \"This is a tool that returns the most downloaded model of a given task on the 
Hugging Face Hub. It returns the name of the checkpoint.\"\n    inputs = {\"task\": {\"type\": \"string\", \"description\": \"The task for which to get the download count.\"}}\n    output_type = \"string\"\n\n    def forward(self, task: str) -> str:\n        most_downloaded_model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return most_downloaded_model.id\n```\n\n子类需要以下属性：\n- 一个清晰的 `name`。名称应该足够描述此工具的功能，以帮助为 agent 提供动力的 LLM。由于此工具返回任务下载量最多的模型，我们将其命名为 `model_download_tool`。\n- 一个 `description`。与 `name` 一样，此描述是为您的 agent 提供动力的 LLM 的说明书，所以不要忽视它。\n- 输入类型和描述\n- 输出类型\n\n\n然后您可以直接初始化您的 agent：\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\nagent = CodeAgent(tools=[ModelDownloadTool()], model=InferenceClientModel())\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?\"\n)\n```\n所有这些属性将在初始化时自动嵌入到 agent 的系统提示中：因此要努力使它们尽可能清晰！\n</hfoption>\n</hfoptions>\n\n您将获得以下日志：\n```text\n╭──────────────────────────────────────── New run ─────────────────────────────────────────╮\n│                                                                                          │\n│ Can you give me the name of the model that has the most downloads in the 'text-to-video' │\n│ task on the Hugging Face Hub?                                                            
│\n│                                                                                          │\n╰─ InferenceClientModel - Qwen/Qwen2.5-Coder-32B-Instruct ───────────────────────────────────────────╯\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 model_name = model_download_tool(task=\"text-to-video\")                               │\n│   2 print(model_name)                                                                    │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nExecution logs:\nByteDance/AnimateDiff-Lightning\n\nOut: None\n[Step 0: Duration 0.27 seconds| Input tokens: 2,069 | Output tokens: 60]\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮\n│   1 final_answer(\"ByteDance/AnimateDiff-Lightning\")                                      │\n╰──────────────────────────────────────────────────────────────────────────────────────────╯\nOut - Final answer: ByteDance/AnimateDiff-Lightning\n[Step 1: Duration 0.10 seconds| Input tokens: 4,288 | Output tokens: 148]\nOut[20]: 'ByteDance/AnimateDiff-Lightning'\n```\n\n> [!TIP]\n> 在 [专用教程](./tutorials/tools#what-is-a-tool-and-how-to-build-one) 中了解更多关于工具的内容。\n\n## 多 agent\n\n多 agent 系统是随着微软的框架 [Autogen](https://huggingface.co/papers/2308.08155) 引入的。\n\n在这种类型的框架中，您有多个 agent 一起工作来解决您的任务，而不是只有一个。\n经验表明，这在大多数基准测试中表现更好。这种更好表现的原因在概念上很简单：对于许多任务，与其使用一个全能系统，您更愿意将单元专门用于子任务。在这里，拥有具有单独工具集和内存的 agent 可以实现高效的专业化。例如，为什么要用网页搜索 agent 访问的所有网页内容填充代码生成 agent 的内存？最好将它们分开。\n\n您可以使用 `smolagents` 轻松构建分层多 agent 系统。\n\n为此，将 agent 封装在 [`ManagedAgent`] 对象中。此对象需要参数 `agent`、`name` 和 `description`，这些参数将嵌入到管理 agent 的系统提示中，以让它知道如何调用此托管 agent，就像我们对工具所做的那样。\n\n以下是一个使用我们的 [`WebSearchTool`] 制作一个管理特定网页搜索 agent 的 agent 
的示例：\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel, WebSearchTool, ManagedAgent\n\nmodel = InferenceClientModel()\n\nweb_agent = CodeAgent(tools=[WebSearchTool()], model=model)\n\nmanaged_web_agent = ManagedAgent(\n    agent=web_agent,\n    name=\"web_search\",\n    description=\"Runs web searches for you. Give it your query as an argument.\"\n)\n\nmanager_agent = CodeAgent(\n    tools=[], model=model, managed_agents=[managed_web_agent]\n)\n\nmanager_agent.run(\"Who is the CEO of Hugging Face?\")\n```\n\n> [!TIP]\n> 有关高效多 agent 实现的深入示例，请参阅 [我们如何将多 agent 系统推向 GAIA 排行榜的顶部](https://huggingface.co/blog/beating-gaia)。\n\n\n## 与您的 agent 交谈并在酷炫的 Gradio 界面中可视化其思考过程\n\n您可以使用 `GradioUI` 交互式地向您的 agent 提交任务并观察其思考和执行过程，以下是一个示例：\n\n```py\nfrom smolagents import (\n    load_tool,\n    CodeAgent,\n    InferenceClientModel,\n    GradioUI\n)\n\n# 从 Hub 导入工具\nimage_generation_tool = load_tool(\"m-ric/text-to-image\")\n\nmodel = InferenceClientModel(model_id=model_id)\n\n# 使用图像生成工具初始化 agent\nagent = CodeAgent(tools=[image_generation_tool], model=model)\n\nGradioUI(agent).launch()\n```\n\n在底层，当用户输入新答案时，agent 会以 `agent.run(user_request, reset=False)` 启动。\n`reset=False` 标志意味着在启动此新任务之前不会刷新 agent 的内存，这使得对话可以继续。\n\n您也可以在其他 agent 化应用程序中使用此 `reset=False` 参数来保持对话继续。\n\n## 下一步\n\n最后，当您按需配置好agent后，即可将其分享至 Hub！\n\n```py\nagent.push_to_hub(\"m-ric/my_agent\")\n```\n\n类似地，若要加载已推送至 Hub 的agent，在信任其工具代码的前提下，可使用：\n\n```py\nagent.from_hub(\"m-ric/my_agent\", trust_remote_code=True)\n```\n\n要更深入地使用，您将需要查看我们的教程：\n- [我们的代码 agent 如何工作的解释](./tutorials/secure_code_execution)\n- [本指南关于如何构建好的 agent](./tutorials/building_good_agents)。\n- [工具使用的深入指南](./tutorials/tools)。\n"
  },
  {
    "path": "docs/source/zh/index.md",
    "content": "# `smolagents`\n\n这是构建强大 agent 的最简单框架！顺便问一下，什么是 \"agent\"？我们在[此页面](conceptual_guides/intro_agents)提供了我们的定义，您还可以找到关于何时使用或不使用它们的建议（剧透：通常不使用 agent 会更好）。\n\n> [!TIP]\n> 译者注：Agent 的业内术语是“智能体”。本译文将保留 agent，不作翻译，以带来更高效的阅读体验。(在中文为主的文章中，It's easier to 注意到英文。Attention Is All You Need!)\n\n本库提供：\n\n✨ **简洁性**：Agent 逻辑仅需约千行代码。我们将抽象保持在原始代码之上的最小形态！\n\n🌐 **支持任何 LLM**：支持通过 Hub 托管的模型，使用其 `transformers` 版本或通过我们的推理 API 加载，也支持 OpenAI、Anthropic 等模型。使用任何 LLM 为 agent 提供动力都非常容易。\n\n🧑‍💻 **一流的代码 agent 支持**，即编写代码作为其操作的 agent（与\"用于编写代码的 agent\"相对），[在此了解更多](tutorials/secure_code_execution)。\n\n🤗 **Hub 集成**：您可以在 Hub 上共享和加载工具，更多功能即将推出！\n\n<div class=\"mt-10\">\n  <div class=\"w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5\">\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./guided_tour\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">导览</div>\n      <p class=\"text-gray-700\">学习基础知识并熟悉使用 agent。如果您是第一次使用 agent，请从这里开始！</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./examples/text_to_sql\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">操作指南</div>\n      <p class=\"text-gray-700\">实用指南，帮助您实现特定目标：创建一个生成和测试 SQL 查询的 agent！</p>\n    </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg\" href=\"./conceptual_guides/intro_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">概念指南</div>\n      <p class=\"text-gray-700\">高级解释，帮助您更好地理解重要主题。</p>\n   </a>\n    <a class=\"!no-underline border dark:border-gray-700 p-5 rounded-lg shadow 
hover:shadow-lg\" href=\"./tutorials/building_good_agents\"\n      ><div class=\"w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed\">教程</div>\n      <p class=\"text-gray-700\">涵盖构建 agent 重要方面的横向教程。</p>\n    </a>\n  </div>\n</div>\n"
  },
  {
    "path": "docs/source/zh/reference/agents.md",
    "content": "# Agents（智能体）\n\n<Tip warning={true}>\n\nSmolagents 是一个实验性的 API，可能会随时发生变化。由于 API 或底层模型可能发生变化，代理返回的结果也可能有所不同。\n\n</Tip>\n\n要了解有关智能体和工具的更多信息，请务必阅读[入门指南](../index)。本页面包含基础类的 API 文档。\n\n## 智能体（Agents）\n\n我们的智能体继承自 [`MultiStepAgent`]，这意味着它们可以执行多步操作，每一步包含一个思考（thought），然后是一个工具调用和执行。请阅读[概念指南](../conceptual_guides/react)以了解更多信息。\n\n我们提供两种类型的代理，它们基于主要的 [`Agent`] 类：\n  - [`CodeAgent`] 是默认代理，它以 Python 代码编写工具调用。\n  - [`ToolCallingAgent`] 以 JSON 编写工具调用。\n\n两者在初始化时都需要提供参数 `model` 和工具列表 `tools`。\n\n### 智能体类\n\n[[autodoc]] MultiStepAgent\n\n[[autodoc]] CodeAgent\n\n[[autodoc]] ToolCallingAgent\n\n### stream_to_gradio\n\n[[autodoc]] stream_to_gradio\n\n### GradioUI\n\n> [!TIP]\n> 您必须安装 `gradio` 才能使用 UI。如果尚未安装，请运行 `pip install 'smolagents[gradio]'`。\n\n[[autodoc]] GradioUI\n\n## 提示（Prompts）\n\n[[autodoc]] smolagents.agents.PromptTemplates\n\n[[autodoc]] smolagents.agents.PlanningPromptTemplate\n\n[[autodoc]] smolagents.agents.ManagedAgentPromptTemplate\n\n[[autodoc]] smolagents.agents.FinalAnswerPromptTemplate\n"
  },
  {
    "path": "docs/source/zh/reference/models.md",
    "content": "# 模型\n\n<Tip warning={true}>\n\nSmolagents 是一个实验性 API，其可能会随时发生更改。由于 API 或底层模型可能会变化，智能体返回的结果可能会有所不同。\n\n</Tip>\n\n要了解有关智能体和工具的更多信息，请务必阅读[入门指南](../index)。此页面包含底层类的 API 文档。\n\n## 模型\n\n您可以自由创建和使用自己的模型为智能体提供支持。\n\n您可以使用任何 `model` 可调用对象作为智能体的模型，只要满足以下条件：\n1. 它遵循[消息格式](./chat_templating)（`List[Dict[str, str]]`），将其作为输入 `messages`，并返回一个 `str`。\n2. 它在生成的序列到达 `stop_sequences` 参数中指定的内容之前停止生成输出。\n\n要定义您的 LLM，可以创建一个 `custom_model` 方法，该方法接受一个 [messages](./chat_templating) 列表，并返回一个包含 `.content` 属性的对象，其中包含生成的文本。此可调用对象还需要接受一个 `stop_sequences` 参数，用于指示何时停止生成。\n\n```python\nfrom huggingface_hub import login, InferenceClient\n\nlogin(\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nclient = InferenceClient(model=model_id)\n\ndef custom_model(messages, stop_sequences=[\"Task\"]):\n    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)\n    answer = response.choices[0].message\n    return answer\n```\n\n此外，`custom_model` 还可以接受一个 `grammar` 参数。如果在智能体初始化时指定了 `grammar`，则此参数将在调用模型时传递，以便进行[约束生成](https://huggingface.co/docs/text-generation-inference/conceptual/guidance)，从而强制生成格式正确的智能体输出。\n\n### TransformersModel\n\n为了方便起见，我们添加了一个 `TransformersModel`，该模型通过为初始化时指定的 `model_id` 构建一个本地 `transformers` pipeline 来实现上述功能。\n\n```python\nfrom smolagents import TransformersModel\n\nmodel = TransformersModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Ok!\"}]}], stop_sequences=[\"great\"]))\n```\n```text\n>>> What a\n```\n\n> [!TIP]\n> 您必须在机器上安装 `transformers` 和 `torch`。如果尚未安装，请运行 `pip install 'smolagents[transformers]'`。\n\n[[autodoc]] TransformersModel\n\n### InferenceClientModel\n\n`InferenceClientModel` 封装了 huggingface_hub 的 [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference)，用于执行 LLM。它支持 HF 的 [Inference API](https://huggingface.co/docs/api-inference/index) 以及 Hub 上所有可用的[Inference 
Providers](https://huggingface.co/blog/inference-providers)。\n\n```python\nfrom smolagents import InferenceClientModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello, how are you?\"}]}\n]\n\nmodel = InferenceClientModel()\nprint(model(messages))\n```\n```text\n>>> Of course! If you change your mind, feel free to reach out. Take care!\n```\n[[autodoc]] InferenceClientModel\n\n### LiteLLMModel\n\n`LiteLLMModel` 利用 [LiteLLM](https://www.litellm.ai/) 支持来自不同提供商的 100+ 个 LLM。您可以在模型初始化时传递 `kwargs`，这些参数将在每次使用模型时被使用，例如下面的示例中传递了 `temperature`。\n\n```python\nfrom smolagents import LiteLLMModel\n\nmessages = [\n  {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello, how are you?\"}]}\n]\n\nmodel = LiteLLMModel(model_id=\"anthropic/claude-3-5-sonnet-latest\", temperature=0.2, max_tokens=10)\nprint(model(messages))\n```\n\n[[autodoc]] LiteLLMModel\n\n### OpenAIModel\n\n此类允许您调用任何 OpenAIServer 兼容模型。\n以下是设置方法（您可以自定义 `api_base` URL 指向其他服务器）：\n```py\nimport os\nfrom smolagents import OpenAIModel\n\nmodel = OpenAIModel(\n    model_id=\"gpt-4o\",\n    api_base=\"https://api.openai.com/v1\",\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n)\n```\n\n[[autodoc]] OpenAIModel\n\n### AzureOpenAIModel\n\n`AzureOpenAIModel` 允许您连接到任何 Azure OpenAI 部署。\n\n下面是设置示例，请注意，如果已经设置了相应的环境变量，您可以省略 `azure_endpoint`、`api_key` 和 `api_version` 参数——环境变量包括 `AZURE_OPENAI_ENDPOINT`、`AZURE_OPENAI_API_KEY` 和 `OPENAI_API_VERSION`。\n\n请注意，`OPENAI_API_VERSION` 没有 `AZURE_` 前缀，这是由于底层 [openai](https://github.com/openai/openai-python) 包的设计所致。\n\n```py\nimport os\n\nfrom smolagents import AzureOpenAIModel\n\nmodel = AzureOpenAIModel(\n    model_id = os.environ.get(\"AZURE_OPENAI_MODEL\"),\n    azure_endpoint=os.environ.get(\"AZURE_OPENAI_ENDPOINT\"),\n    api_key=os.environ.get(\"AZURE_OPENAI_API_KEY\"),\n    api_version=os.environ.get(\"OPENAI_API_VERSION\")    \n)\n```\n\n[[autodoc]] AzureOpenAIModel\n\n### MLXModel\n\n```python\nfrom smolagents import 
MLXModel\n\nmodel = MLXModel(model_id=\"HuggingFaceTB/SmolLM-135M-Instruct\")\n\nprint(model([{\"role\": \"user\", \"content\": \"Ok!\"}], stop_sequences=[\"great\"]))\n```\n```text\n>>> What a\n```\n\n> [!TIP]\n> 您必须在机器上安装 `mlx-lm`。如果尚未安装，请运行 `pip install 'smolagents[mlx-lm]'`。\n\n[[autodoc]] MLXModel\n"
  },
  {
    "path": "docs/source/zh/reference/tools.md",
    "content": "# 工具\n\n<Tip warning={true}>\n\nSmolagents 是一个实验性 API，可能会随时更改。由于 API 或底层模型可能发生变化，代理返回的结果可能会有所不同。\n\n</Tip>\n\n要了解更多关于智能体和工具的信息，请务必阅读[入门指南](../index)。本页面包含底层类的 API 文档。\n\n## 工具\n\n### load_tool\n\n[[autodoc]] load_tool\n\n### tool\n\n[[autodoc]] tool\n\n### Tool\n\n[[autodoc]] Tool\n\n### launch_gradio_demo\n\n[[autodoc]] launch_gradio_demo\n\n## 默认工具\n\n### PythonInterpreterTool\n\n[[autodoc]] PythonInterpreterTool\n\n### FinalAnswerTool\n\n[[autodoc]] FinalAnswerTool\n\n### UserInputTool\n\n[[autodoc]] UserInputTool\n\n### DuckDuckGoSearchTool\n\n[[autodoc]] DuckDuckGoSearchTool\n\n### GoogleSearchTool\n\n[[autodoc]] GoogleSearchTool\n\n### VisitWebpageTool\n\n[[autodoc]] VisitWebpageTool\n\n### SpeechToTextTool\n\n[[autodoc]] SpeechToTextTool\n\n## 工具集合\n\n[[autodoc]] ToolCollection\n\n## 智能体类型\n\n智能体可以处理工具之间的任何类型的对象；工具是完全多模态的，可以接受和返回文本、图像、音频、视频以及其他类型的对象。为了增加工具之间的兼容性，以及正确呈现在 ipython（jupyter、colab、ipython notebooks 等）中的返回结果，我们为这些类型实现了包装类。\n\n被包装的对象应该继续保持其初始行为；例如，一个文本对象应继续表现为字符串，一个图像对象应继续表现为 `PIL.Image`。\n\n这些类型有三个特定的用途：\n\n- 调用 `to_raw` 方法时，应返回底层对象\n- 调用 `to_string` 方法时，应将对象转换为字符串：对于 `AgentText` 类型，可以直接返回字符串；对于其他实例，则返回对象序列化版本的路径\n- 在 ipython 内核中显示时，应正确显示对象\n\n### AgentText\n\n[[autodoc]] smolagents.agent_types.AgentText\n\n### AgentImage\n\n[[autodoc]] smolagents.agent_types.AgentImage\n\n### AgentAudio\n\n[[autodoc]] smolagents.agent_types.AgentAudio\n"
  },
  {
    "path": "docs/source/zh/tutorials/building_good_agents.md",
    "content": "# 构建好用的 agent\n\n[[open-in-colab]]\n\n能良好工作的 agent 和不能工作的 agent 之间，有天壤之别。\n我们怎么样才能构建出属于前者的 agent 呢？\n在本指南中，我们将看到构建 agent 的最佳实践。\n\n> [!TIP]\n> 如果你是 agent 构建的新手，请确保首先阅读 [agent 介绍](../conceptual_guides/intro_agents) 和 [smolagents 导览](../guided_tour)。\n\n### 最好的 agent 系统是最简单的：尽可能简化工作流\n\n在你的工作流中赋予 LLM 一些自主权，会引入一些错误风险。\n\n经过良好编程的 agent 系统，通常具有良好的错误日志记录和重试机制，因此 LLM 引擎有机会自我纠错。但为了最大限度地降低 LLM 错误的风险，你应该简化你的工作流！\n\n让我们回顾一下 [agent 介绍](../conceptual_guides/intro_agents) 中的例子：一个为冲浪旅行公司回答用户咨询的机器人。\n与其让 agent 每次被问及新的冲浪地点时，都分别调用 \"旅行距离 API\" 和 \"天气 API\"，你可以只创建一个统一的工具 \"return_spot_information\"，一个同时调用这两个 API，并返回它们连接输出的函数。\n\n这可以降低成本、延迟和错误风险！\n\n主要的指导原则是：尽可能减少 LLM 调用的次数。\n\n这可以带来一些启发：\n- 尽可能把两个工具合并为一个，就像我们两个 API 的例子。\n- 尽可能基于确定性函数，而不是 agent 决策，来实现逻辑。\n\n### 改善流向 LLM 引擎的信息流\n\n记住，你的 LLM 引擎就像一个 ~智能~ 机器人，被关在一个房间里，与外界唯一的交流方式是通过门缝传递的纸条。\n\n如果你没有明确地将信息放入其提示中，它将不知道发生的任何事情。\n\n所以首先要让你的任务非常清晰！\n由于 agent 由 LLM 驱动，任务表述的微小变化可能会产生完全不同的结果。\n\n然后，改善工具使用中流向 agent 的信息流。\n\n需要遵循的具体指南：\n- 每个工具都应该记录（只需在工具的 `forward` 方法中使用 `print` 语句）对 LLM 引擎可能有用的所有信息。\n  - 特别是，记录工具执行错误的详细信息会很有帮助！\n\n例如，这里有一个根据位置和日期时间检索天气数据的工具：\n\n首先，这是一个糟糕的版本：\n```python\nimport datetime\nfrom smolagents import tool\n\ndef get_weather_report_at_coordinates(coordinates, date_time):\n    # 虚拟函数，返回 [温度（°C），降雨风险（0-1），浪高（m）]\n    return [28.0, 0.35, 0.85]\n\ndef get_coordinates_from_location(location):\n    # 返回虚拟坐标\n    return [3.3, -42.0]\n\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for.\n        date_time: the date and time for which you want the report.\n    \"\"\"\n    lon, lat = convert_location_to_coordinates(location)\n    date_time = datetime.strptime(date_time)\n    return str(get_weather_report_at_coordinates((lon, lat), date_time))\n```\n\n为什么它不好？\n- 没有说明 `date_time` 应该使用的格式\n- 没有说明位置应该如何指定\n- 没有记录机制来处理明确的报错情况，如位置格式不正确或 date_time 格式不正确\n- 
输出格式难以理解\n\n如果工具调用失败，内存中记录的错误跟踪，可以帮助 LLM 逆向工程工具来修复错误。但为什么要让它做这么多繁重的工作呢？\n\n构建这个工具的更好方式如下：\n```python\n@tool\ndef get_weather_api(location: str, date_time: str) -> str:\n    \"\"\"\n    Returns the weather report.\n\n    Args:\n        location: the name of the place that you want the weather for. Should be a place name, followed by possibly a city name, then a country, like \"Anchor Point, Taghazout, Morocco\".\n        date_time: the date and time for which you want the report, formatted as '%m/%d/%y %H:%M:%S'.\n    \"\"\"\n    lon, lat = get_coordinates_from_location(location)\n    try:\n        date_time = datetime.datetime.strptime(date_time, '%m/%d/%y %H:%M:%S')\n    except Exception as e:\n        raise ValueError(\"Conversion of `date_time` to datetime format failed, make sure to provide a string in format '%m/%d/%y %H:%M:%S'. Full trace: \" + str(e))\n    temperature_celsius, risk_of_rain, wave_height = get_weather_report_at_coordinates((lon, lat), date_time)\n    return f\"Weather report for {location}, {date_time}: Temperature will be {temperature_celsius}°C, risk of rain is {risk_of_rain*100:.0f}%, wave height is {wave_height}m.\"\n```\n\n一般来说，为了减轻 LLM 的负担，要问自己的好问题是：\"如果我是一个第一次使用这个工具的傻瓜，使用这个工具编程并纠正自己的错误有多容易？\"。\n\n### 给 agent 更多参数\n\n除了简单的任务描述字符串外，你还可以使用 `additional_args` 参数传递任何类型的对象：\n\n```py\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel_id = \"meta-llama/Llama-3.3-70B-Instruct\"\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(model_id=model_id), add_base_tools=True)\n\nagent.run(\n    \"Why does Mike not know many people in New York?\",\n    additional_args={\"mp3_sound_file_url\":'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3'}\n)\n```\n例如，你可以使用这个 `additional_args` 参数传递你希望 agent 利用的图像或字符串。\n\n\n## 如何调试你的 agent\n\n### 1. 
使用更强大的 LLM\n\n在 agent 工作流中，有些错误是实际错误，有些则是你的 LLM 引擎没有正确推理的结果。\n例如，参考这个我要求创建一个汽车图片的 `CodeAgent` 的运行记录：\n```text\n==================================================================================================== New task ====================================================================================================\nMake me a cool car picture\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ─────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nimage_generator(prompt=\"A cool, futuristic sports car with LED headlights, aerodynamic design, and vibrant color, high-res, photorealistic\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nStep 1:\n\n- Time taken: 16.35 seconds\n- Input tokens: 1,383\n- Output tokens: 77\n──────────────────────────────────────────────────────────────────────────────────────────────────── New step ─────────────────────────────────────────────────────────────────────────────────────────────────────\nAgent is executing the code below: 
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nfinal_answer(\"/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\")\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\nPrint outputs:\n\nLast output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\nFinal answer:\n/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png\n```\n用户看到的是返回了一个路径，而不是图像。\n这看起来像是系统的错误，但实际上 agent 系统并没有导致错误：只是 LLM 大脑犯了一个错误，没有把图像输出，保存到变量中。\n因此，它无法再次访问图像，只能利用保存图像时记录的路径，所以它返回的是路径，而不是图像。\n\n调试 agent 的第一步是\"使用更强大的 LLM\"。像 `Qwen2.5-72B-Instruct` 这样的替代方案不会犯这种错误。\n\n### 2. 提供更多指导/更多信息\n\n你也可以使用不太强大的模型，只要你更有效地指导它们。\n\n站在模型的角度思考：如果你是模型在解决任务，你会因为系统提示+任务表述+工具描述中提供的信息而挣扎吗？\n\n你需要一些额外的说明吗？\n\n为了提供额外信息，我们不建议立即更改系统提示：默认系统提示有许多调整，除非你非常了解提示，否则你很容易翻车。\n更好的指导 LLM 引擎的方法是：\n- 如果是关于要解决的任务：把所有细节添加到任务中。任务可以有几百页长。\n- 如果是关于如何使用工具：你的工具的 description 属性。\n\n\n### 3. 更改系统提示（通常不建议）\n\n如果上述说明不够，你可以更改系统提示。\n\n让我们看看它是如何工作的。例如，让我们检查 [`CodeAgent`] 的默认系统提示（下面的版本通过跳过零样本示例进行了缩短）。\n\n```python\nprint(agent.prompt_templates[\"system_prompt\"])\n```\n你会得到：\n```text\nYou are an expert assistant who can solve any task using code blobs. 
You will be given a task to solve as best you can.\nTo do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\nTo solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.\n\nAt each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.\nThen in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '<end_code>' sequence.\nDuring each intermediate step, you can use 'print()' to save whatever important information you will then need.\nThese print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\nIn the end you have to return a final answer using the `final_answer` tool.\n\nHere are a few examples using notional tools:\n---\nTask: \"Generate an image of the oldest person in this document.\"\n\nThought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\nCode:\n```py\nanswer = document_qa(document=document, question=\"Who is the oldest person mentioned?\")\nprint(answer)\n```<end_code>\nObservation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\nThought: I will now generate an image showcasing the oldest person.\nCode:\n```py\nimage = image_generator(\"A portrait of John Doe, a 55-year-old man living in Canada.\")\nfinal_answer(image)\n```<end_code>\n\n---\nTask: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\nThought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool\nCode:\n```py\nresult = 5 + 3 + 
1294.678\nfinal_answer(result)\n```<end_code>\n\n---\nTask:\n\"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.\nYou have been provided with these additional arguments, that you can access using the keys as variables in your python code:\n{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}\"\n\nThought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.\nCode:\n```py\ntranslated_question = translator(question=question, src_lang=\"French\", tgt_lang=\"English\")\nprint(f\"The translated question is {translated_question}.\")\nanswer = image_qa(image=image, question=translated_question)\nfinal_answer(f\"The answer is {answer}\")\n```<end_code>\n\n---\nTask:\nIn a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\nWhat does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\nThought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\nCode:\n```py\npages = search(query=\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\")\nprint(pages)\n```<end_code>\nObservation:\nNo result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\nThought: The query was maybe too restrictive and did not find any results. 
Let's try again with a broader query.\nCode:\n```py\npages = search(query=\"1979 interview Stanislaus Ulam\")\nprint(pages)\n```<end_code>\nObservation:\nFound 6 pages:\n[Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n[Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n(truncated)\n\nThought: I will read the first 2 pages to know more.\nCode:\n```py\nfor url in [\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\", \"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\"]:\n    whole_page = visit_webpage(url)\n    print(whole_page)\n    print(\"\\n\" + \"=\"*80 + \"\\n\")  # Print separator between pages\n```<end_code>\nObservation:\nManhattan Project Locations:\nLos Alamos, NM\nStanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. 
In this interview, he discusses his work at\n(truncated)\n\nThought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\" Let's answer in one word.\nCode:\n```py\nfinal_answer(\"diminished\")\n```<end_code>\n\n---\nTask: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\nThought: I need to get the populations for both cities and compare them: I will use the tool `search` to get the population of both cities.\nCode:\n```py\nfor city in [\"Guangzhou\", \"Shanghai\"]:\n    print(f\"Population {city}:\", search(f\"{city} population\")\n```<end_code>\nObservation:\nPopulation Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\nPopulation Shanghai: '26 million (2019)'\n\nThought: Now I know that Shanghai has the highest population.\nCode:\n```py\nfinal_answer(\"Shanghai\")\n```<end_code>\n\n---\nTask: \"What is the current age of the pope, raised to the power 0.36?\"\n\nThought: I will use the tool `wiki` to get the age of the pope, and confirm that with a web search.\nCode:\n```py\npope_age_wiki = wiki(query=\"current pope age\")\nprint(\"Pope age as per wikipedia:\", pope_age_wiki)\npope_age_search = web_search(query=\"current pope age\")\nprint(\"Pope age as per google search:\", pope_age_search)\n```<end_code>\nObservation:\nPope age: \"The pope Francis is currently 88 years old.\"\n\nThought: I know that the pope is 88 years old. Let's compute the result using python code.\nCode:\n```py\npope_current_age = 88 ** 0.36\nfinal_answer(pope_current_age)\n```<end_code>\n\nAbove example were using notional tools that might not exist for you. 
On top of performing computations in the Python code snippets that you create, you only have access to these tools:\n{%- for tool in tools.values() %}\n- {{ tool.to_tool_calling_prompt() }}\n{%- endfor %}\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n{%- for agent in managed_agents.values() %}\n- {{ agent.name }}: {{ agent.description }}\n{%- endfor %}\n{%- endif %}\n\nHere are the rules you should always follow to solve your task:\n1. Always provide a 'Thought:' sequence, and a 'Code:\\n```py' sequence ending with '```<end_code>' sequence, else you will fail.\n2. Use only variables that you have defined!\n3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wiki({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wiki(query=\"What is the place where James Bond lives?\")'.\n4. Take care to not chain too many sequential tool calls in the same code block, especially when the output format is unpredictable. For instance, a call to search has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n5. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n6. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n7. 
Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n8. You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n9. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n10. Don't give up! You're in charge of solving the task, not providing directions to solve it.\n\nNow Begin! If you solve the task correctly, you will receive a reward of $1,000,000.\n```\n\n如你所见，有一些占位符，如 `\"{{ tool.to_tool_calling_prompt() }}\"`：这些将在 agent 初始化时用于插入某些自动生成的工具或 managed agent 的描述。\n\n因此，虽然你可以通过将自定义提示作为参数传递给 `system_prompt` 参数来覆盖此系统提示模板，但你的新系统提示必须包含以下占位符：\n- 用于插入工具描述。\n  ```\n  {%- for tool in tools.values() %}\n  - {{ tool.to_tool_calling_prompt() }}\n  {%- endfor %}\n  ```\n- 用于插入 managed agent 的描述（如果有）。\n  ```\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  {%- for agent in managed_agents.values() %}\n  - {{ agent.name }}: {{ agent.description }}\n  {%- endfor %}\n  {%- endif %}\n  ```\n- 仅限 `CodeAgent`：`\"{{authorized_imports}}\"` 用于插入授权导入列表。\n\n然后你可以按如下方式更改系统提示：\n\n```py\nagent.prompt_templates[\"system_prompt\"] = agent.prompt_templates[\"system_prompt\"] + \"\\nHere you go!\"\n```\n\n这也适用于 [`ToolCallingAgent`]。\n\n\n### 4. 
额外规划\n\n我们提供了一个用于补充规划步骤的模型，agent 可以在正常操作步骤之间定期运行。在此步骤中，没有工具调用，LLM 只是被要求更新它知道的事实列表，并根据这些事实反推它应该采取的下一步。\n\n```py\nfrom smolagents import load_tool, CodeAgent, InferenceClientModel, WebSearchTool\nfrom dotenv import load_dotenv\n\nload_dotenv()\n\n# 从 Hub 导入工具\nimage_generation_tool = load_tool(\"m-ric/text-to-image\", trust_remote_code=True)\n\nsearch_tool = WebSearchTool()\n\nagent = CodeAgent(\n    tools=[search_tool],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen2.5-72B-Instruct\"),\n    planning_interval=3 # 这是你激活规划的地方！\n)\n\n# 运行它！\nresult = agent.run(\n    \"How long would a cheetah at full speed take to run the length of Pont Alexandre III?\",\n)\n```"
  },
  {
    "path": "docs/source/zh/tutorials/inspect_runs.md",
    "content": "# 使用 OpenTelemetry 检查运行记录\n\n[[open-in-colab]]\n\n> [!TIP]\n> 如果您是初次构建Agent，建议先阅读 [Agent 入门指南](../conceptual_guides/intro_agents) 和 [smolagents 导览](../guided_tour)。\n\n## 为什么需要记录Agent运行？\n\n调试Agent运行过程具有挑战性。\n\n验证运行是否正常进行很困难，因为Agent的工作流程本身具有 [设计上的不可预测性](../conceptual_guides/intro_agents)（如果可预测，直接使用传统代码即可）。\n\n检查运行记录同样困难：多步骤的Agent往往会快速在控制台生成大量日志，而大多数错误只是\"LLM 低级错误\"类型的问题，通常LLM会在后续步骤中通过生成更好的代码或工具调用来自我修正。\n\n因此，在生产环境中使用监控工具记录Agent运行过程，对于后续检查和分析至关重要！\n\n我们采用 [OpenTelemetry](https://opentelemetry.io/) 标准来实现Agent运行监控。\n\n这意味着您只需添加少量监控代码，即可在正常运行Agent时自动记录所有信息到监控平台。以下是在不同OpenTelemetry后端实现此功能的示例：\n\n在监控平台上的展示效果如下：\n\n<div class=\"flex justify-center\">\n    <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.gif\"/>\n</div>\n\n\n## 使用 Arize AI Phoenix 配置遥测\n\n首先安装必要的软件包。这里我们选择安装 [Arize AI 的 Phoenix](https://github.com/Arize-ai/phoenix) 作为日志收集和检查方案，您也可以使用其他兼容 OpenTelemetry 的平台来完成收集与检查工作。\n\n```shell\npip install 'smolagents[telemetry]'\n```\n\n接着在后台运行日志收集器：\n\n```shell\npython -m phoenix.server.main serve\n```\n\n最后配置 `SmolagentsInstrumentor` 来追踪Agent活动，并将追踪数据发送至 Phoenix 默认端点：\n\n```python\nfrom phoenix.otel import register\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\n\nregister()\nSmolagentsInstrumentor().instrument()\n```\n\n完成上述配置后，即可正常运行您的Agent！\n\n```py\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel()\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n)\nmanager_agent.run(\n    \"If the US keeps its 2024 growth rate, how many years will it take for the GDP to double?\"\n)\n```\nVoilà!\n\n此时访问 
`http://0.0.0.0:6006/projects/` 即可查看运行记录：\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.png\">\n\n如图所示，CodeAgent 调用了其托管的 ToolCallingAgent（注：托管Agent也可以是另一个 CodeAgent）执行美国2024年经济增长率的网络搜索。托管Agent返回报告后，管理Agent根据结果计算出经济翻倍周期！是不是很智能？\n\n## 使用 🪢 Langfuse 配置遥测\n\n本部分演示如何通过 `SmolagentsInstrumentor` 使用 **Langfuse** 监控和调试 Hugging Face **smolagents**。\n\n> **Langfuse 是什么？** [Langfuse](https://langfuse.com) 是面向LLM工程的开源平台，提供AI Agent的追踪与监控功能，帮助开发者调试、分析和优化产品。该平台通过原生集成、OpenTelemetry 和 SDKs 与各类工具框架对接。\n\n### 步骤 1: 安装依赖\n\n```python\n%pip install langfuse 'smolagents[telemetry]' openinference-instrumentation-smolagents\n```\n\n### 步骤 2: 配置环境变量\n\n设置 Langfuse API 密钥，并配置 OpenTelemetry 端点将追踪数据发送至 Langfuse。通过注册 [Langfuse Cloud](https://cloud.langfuse.com) 或 [自托管 Langfuse](https://langfuse.com/self-hosting) 获取 API 密钥。\n\n同时需添加 [Hugging Face 令牌](https://huggingface.co/settings/tokens) (`HF_TOKEN`) 作为环境变量：\n```python\nimport os\n# Get keys for your project from the project settings page: https://cloud.langfuse.com\nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-lf-...\" \nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-lf-...\" \nos.environ[\"LANGFUSE_HOST\"] = \"https://cloud.langfuse.com\" # 🇪🇺 EU region\n# os.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\" # 🇺🇸 US region\n \n# your Hugging Face token\nos.environ[\"HF_TOKEN\"] = \"hf_...\"\n```\n\n```python\nfrom langfuse import get_client\n \nlangfuse = get_client()\n \n# Verify connection\nif langfuse.auth_check():\n    print(\"Langfuse client is authenticated and ready!\")\nelse:\n    print(\"Authentication failed. 
Please check your credentials and host.\")\n```\n\n### 步骤 3: 初始化 `SmolagentsInstrumentor`\n\n在应用程序代码执行前初始化 `SmolagentsInstrumentor`。\n\n\n```python\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\n \nSmolagentsInstrumentor().instrument()\n```\n\n### 步骤 4: 运行 smolagent\n\n```python\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    WebSearchTool,\n    VisitWebpageTool,\n    InferenceClientModel,\n)\n\nmodel = InferenceClientModel(\n    model_id=\"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B\"\n)\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n)\nmanager_agent.run(\n    \"How can Langfuse be used to monitor and improve the reasoning and decision-making of smolagents when they execute multi-step tasks, like dynamically adjusting a recipe based on user feedback or available ingredients?\"\n)\n```\n\n### 步骤 5: 在 Langfuse 中查看追踪记录\n\n运行Agent后，您可以在 [Langfuse](https://cloud.langfuse.com) 平台查看 smolagents 应用生成的追踪记录。这些记录会详细展示LLM的交互步骤，帮助您调试和优化AI代理。\n\n![smolagents 追踪示例](https://langfuse.com/images/cookbook/integration-smolagents/smolagent_example_trace.png)\n\n_[Langfuse 公开示例追踪](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/ce5160f9bfd5a6cd63b07d2bfcec6f54?timestamp=2025-02-11T09%3A25%3A45.163Z&display=details)_"
  },
  {
    "path": "docs/source/zh/tutorials/memory.md",
    "content": "# 📚 管理Agent的记忆\n\n[[open-in-colab]]\n\n归根结底，Agent可以定义为由几个简单组件构成：它拥有工具、提示词。最重要的是，它具备对过往步骤的记忆，能够追溯完整的规划、执行和错误历史。\n\n### 回放Agent的记忆\n\n我们提供了多项功能来审查Agent的过往运行记录。\n\n您可以通过插装（instrumentation）在可视化界面中查看Agent的运行过程，该界面支持对特定步骤进行缩放操作，具体方法参见[插装指南](./inspect_runs)。\n\n您也可以使用`agent.replay()`方法实现回放：\n\n当Agent完成运行后：\n```py\nfrom smolagents import InferenceClientModel, CodeAgent\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=0)\n\nresult = agent.run(\"What's the 20th Fibonacci number?\")\n```\n\n若要回放最近一次运行，只需使用：\n```py\nagent.replay()\n```\n\n### 动态修改Agent的记忆\n\n许多高级应用场景需要对Agent的记忆进行动态修改。\n\n您可以通过以下方式访问Agent的记忆：\n\n```py\nfrom smolagents import ActionStep\n\nsystem_prompt_step = agent.memory.system_prompt\nprint(\"The system prompt given to the agent was:\")\nprint(system_prompt_step.system_prompt)\n\ntask_step = agent.memory.steps[0]\nprint(\"\\n\\nThe first task step was:\")\nprint(task_step.task)\n\nfor step in agent.memory.steps:\n    if isinstance(step, ActionStep):\n        if step.error is not None:\n            print(f\"\\nStep {step.step_number} got this error:\\n{step.error}\\n\")\n        else:\n            print(f\"\\nStep {step.step_number} got these observations:\\n{step.observations}\\n\")\n```\n\n使用`agent.memory.get_full_steps()`可获取完整步骤字典数据。\n\n您还可以通过步骤回调（step callbacks）实现记忆的动态修改。\n\n步骤回调函数可通过参数直接访问`agent`对象，因此能够访问所有记忆步骤并根据需要进行修改。例如，假设您正在监控网页浏览Agent每个步骤的屏幕截图，希望保留最新截图同时删除旧步骤的图片以节省token消耗。\n\n可参考以下代码示例：\n_注：此代码片段不完整，部分导入语句和对象定义已精简，完整代码请访问[原始脚本](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py)_\n\n```py\nimport helium\nfrom PIL import Image\nfrom io import BytesIO\nfrom time import sleep\n\ndef update_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # Let JavaScript animations happen before taking the screenshot\n    driver = helium.get_driver()\n    latest_step = memory_step.step_number\n    for previous_memory_step in agent.memory.steps:  # Remove 
previous screenshots from logs for lean processing\n        if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= latest_step - 2:\n            previous_memory_step.observations_images = None\n    png_bytes = driver.get_screenshot_as_png()\n    image = Image.open(BytesIO(png_bytes))\n    memory_step.observations_images = [image.copy()]\n```\n\n最后在初始化Agent时，将此函数传入`step_callbacks`参数：\n\n```py\nCodeAgent(\n    tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],\n    model=model,\n    additional_authorized_imports=[\"helium\"],\n    step_callbacks=[update_screenshot],\n    max_steps=20,\n    verbosity_level=2,\n)\n```\n\n请访问我们的 [vision web browser code](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py) 查看完整可运行示例。\n\n### 分步运行 Agents\n\n当您需要处理耗时数天的工具调用时，这种方式特别有用：您可以逐步执行Agents。这还允许您在每一步更新记忆。\n\n```py\nfrom smolagents import InferenceClientModel, CodeAgent, ActionStep, TaskStep\n\nagent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=1)\nprint(agent.memory.system_prompt)\n\ntask = \"What is the 20th Fibonacci number?\"\n\n# You could modify the memory as needed here by inputting the memory of another agent.\n# agent.memory.steps = previous_agent.memory.steps\n\n# Let's start a new task!\nagent.memory.steps.append(TaskStep(task=task, task_images=[]))\n\nfinal_answer = None\nstep_number = 1\nwhile final_answer is None and step_number <= 10:\n    memory_step = ActionStep(\n        step_number=step_number,\n        observations_images=[],\n    )\n    # Run one step.\n    final_answer = agent.step(memory_step)\n    agent.memory.steps.append(memory_step)\n    step_number += 1\n\n    # Change the memory as you please!\n    # For instance to update the latest step:\n    # agent.memory.steps[-1] = ...\n\nprint(\"The final answer is:\", final_answer)\n```"
  },
  {
    "path": "docs/source/zh/tutorials/secure_code_execution.md",
    "content": "# 安全代码执行\n\n[[open-in-colab]]\n\n> [!TIP]\n> 如果你是第一次构建 agent，请先阅读 [agent 介绍](../conceptual_guides/intro_agents) 和 [smolagents 导览](../guided_tour)。\n\n### 代码智能体\n\n[多项](https://huggingface.co/papers/2402.01030) [研究](https://huggingface.co/papers/2411.01747) [表明](https://huggingface.co/papers/2401.00812)，让大语言模型用代码编写其动作（工具调用）比当前标准的工具调用格式要好得多，目前行业标准是 \"将动作写成包含工具名称和参数的 JSON\" 的各种变体。\n\n为什么代码更好？因为我们专门为计算机执行的动作而设计编程语言。如果 JSON 片段是更好的方式，那么这个工具包就应该是用 JSON 片段编写的，魔鬼就会嘲笑我们。\n\n代码就是表达计算机动作的更好方式。它具有更好的：\n- **组合性**：你能像定义 Python 函数那样，在 JSON 动作中嵌套其他 JSON 动作，或者定义一组 JSON 动作以便以后重用吗？\n- **对象管理**：你如何在 JSON 中存储像 `generate_image` 这样的动作的输出？\n- **通用性**：代码是为了简单地表达任何可以让计算机做的事情而构建的。\n- **在 LLM 训练语料库中的表示**：天赐良机，为什么不利用已经包含在 LLM 训练语料库中的大量高质量动作呢？\n\n下图展示了这一点，取自 [可执行代码动作引出更好的 LLM 智能体](https://huggingface.co/papers/2402.01030)。\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png\">\n\n这就是为什么我们强调提出代码智能体，在本例中是 Python 智能体，这意味着我们要在构建安全的 Python 解释器上投入更多精力。\n\n### 本地 Python 解释器\n\n默认情况下，`CodeAgent` 会在你的环境中运行 LLM 生成的代码。\n这个执行不是由普通的 Python 解释器完成的：我们从零开始重新构建了一个更安全的 `LocalPythonExecutor`。\n这个解释器通过以下方式设计以确保安全：\n  - 将导入限制为用户显式传递的列表\n  - 限制操作次数以防止无限循环和资源膨胀\n  - 不会执行任何未预定义的操作\n\n我们已经在许多用例中使用了这个解释器，从未观察到对环境造成任何损害。\n\n然而，这个解决方案并不是万无一失的：可以想象，如果 LLM 被微调用于恶意操作，仍然可能损害你的环境。例如，如果你允许像 `Pillow` 这样无害的包处理图像，LLM 可能会生成数千张图像保存以膨胀你的硬盘。\n如果你自己选择了 LLM 引擎，这当然不太可能，但它可能会发生。\n\n所以如果你想格外谨慎，可以使用下面描述的远程代码执行选项。\n\n### E2B 代码执行器\n\n为了最大程度的安全性，你可以使用我们与 E2B 的集成在沙盒环境中运行代码。这是一个远程执行服务，可以在隔离的容器中运行你的代码，使代码无法影响你的本地环境。\n\n为此，你需要设置你的 E2B 账户并在环境变量中设置 `E2B_API_KEY`。请前往 [E2B 快速入门文档](https://e2b.dev/docs/quickstart) 了解更多信息。\n\n然后你可以通过 `pip install e2b-code-interpreter python-dotenv` 安装它。\n\n现在你已经准备好了！\n\n要将代码执行器设置为 E2B，只需在初始化 `CodeAgent` 时传递标志 `executor_type=\"e2b\"`。\n请注意，你应该将所有工具的依赖项添加到 `additional_authorized_imports` 中，以便执行器安装它们。\n\n```py\nfrom smolagents import CodeAgent, VisitWebpageTool, InferenceClientModel\nagent = CodeAgent(\n    tools = 
[VisitWebpageTool()],\n    model=InferenceClientModel(),\n    additional_authorized_imports=[\"requests\", \"markdownify\"],\n    executor_type=\"e2b\"\n)\n\nagent.run(\"What was Abraham Lincoln's preferred pet?\")\n```\n\n目前 E2B 代码执行暂不兼容多 agent——因为把 agent 调用放在应该在远程执行的代码块里，是非常混乱的。但我们正在努力做到这件事！\n"
  },
  {
    "path": "docs/source/zh/tutorials/tools.md",
    "content": "# 工具\n\n[[open-in-colab]]\n\n在这里，我们将学习高级工具的使用。\n\n> [!TIP]\n> 如果你是构建 agent 的新手，请确保先阅读 [agent 介绍](../conceptual_guides/intro_agents) 和 [smolagents 导览](../guided_tour)。\n\n- [工具](#工具)\n    - [什么是工具，如何构建一个工具？](#什么是工具如何构建一个工具)\n    - [将你的工具分享到 Hub](#将你的工具分享到-hub)\n    - [将 Space 导入为工具](#将-space-导入为工具)\n    - [使用 LangChain 工具](#使用-langchain-工具)\n    - [管理你的 agent 工具箱](#管理你的-agent-工具箱)\n    - [使用工具集合](#使用工具集合)\n\n### 什么是工具，如何构建一个工具？\n\n工具主要是 LLM 可以在 agent 系统中使用的函数。\n\n但要使用它，LLM 需要被提供一个 API：名称、工具描述、输入类型和描述、输出类型。\n\n所以它不能仅仅是一个函数。它应该是一个类。\n\n因此，核心上，工具是一个类，它包装了一个函数，并带有帮助 LLM 理解如何使用它的元数据。\n\n以下是它的结构：\n\n```python\nfrom smolagents import Tool\n\nclass HFModelDownloadsTool(Tool):\n    name = \"model_download_counter\"\n    description = \"\"\"\n    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n    It returns the name of the checkpoint.\"\"\"\n    inputs = {\n        \"task\": {\n            \"type\": \"string\",\n            \"description\": \"the task category (such as text-classification, depth-estimation, etc)\",\n        }\n    }\n    output_type = \"string\"\n\n    def forward(self, task: str):\n        from huggingface_hub import list_models\n\n        model = next(iter(list_models(filter=task, sort=\"downloads\", direction=-1)))\n        return model.id\n\nmodel_downloads_tool = HFModelDownloadsTool()\n```\n\n自定义工具继承 [`Tool`] 以继承有用的方法。子类还定义了：\n- 一个属性 `name`，对应于工具本身的名称。名称通常描述工具的功能。由于代码返回任务中下载量最多的模型，我们将其命名为 `model_download_counter`。\n- 一个属性 `description`，用于填充 agent 的系统提示。\n- 一个 `inputs` 属性，它是一个带有键 `\"type\"` 和 `\"description\"` 的字典。它包含帮助 Python 解释器对输入做出明智选择的信息。\n- 一个 `output_type` 属性，指定输出类型。`inputs` 和 `output_type` 的类型应为 [Pydantic 格式](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema)，它们可以是以下之一：`[\"string\", \"boolean\",\"integer\", \"number\", \"image\", \"audio\", \"array\", \"object\", \"any\", \"null\"]`。\n- 一个 `forward` 方法，包含要执行的推理代码。\n\n这就是它在 agent 
中使用所需的全部内容！\n\n还有另一种构建工具的方法。在 [guided_tour](../guided_tour) 中，我们使用 `@tool` 装饰器实现了一个工具。[`tool`] 装饰器是定义简单工具的推荐方式，但有时你需要更多：在类中使用多个方法以获得更清晰的代码，或使用额外的类属性。\n\n在这种情况下，你可以通过如上所述继承 [`Tool`] 来构建你的工具。\n\n### 将你的工具分享到 Hub\n\n你可以通过调用 [`~Tool.push_to_hub`] 将你的自定义工具分享到 Hub。确保你已经在 Hub 上为其创建了一个仓库，并且使用的是具有写入权限的 token。\n\n```python\nmodel_downloads_tool.push_to_hub(\"{your_username}/hf-model-downloads\", token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\")\n```\n\n为了使推送到 Hub 正常工作，你的工具需要遵守一些规则：\n- 所有方法都是自包含的，例如使用来自其参数中的变量。\n- 根据上述要点，**所有导入应直接在工具的函数中定义**，否则在对你的自定义工具调用 [`~Tool.save`] 或 [`~Tool.push_to_hub`] 时会出现错误。\n- 如果你重写了 `__init__` 方法，除了 `self` 之外，你不能给它任何其他参数。这是因为在特定工具实例初始化期间设置的参数很难跟踪，这阻碍了将它们正确分享到 Hub。无论如何，创建特定类的想法是你已经可以为任何需要硬编码的内容设置类属性（只需在 `class YourTool(Tool):` 行下直接设置 `your_variable=(...)`）。当然，你仍然可以通过将内容分配给 `self.your_variable` 在代码中的任何地方创建实例属性。\n\n一旦你的工具被推送到 Hub，你就可以查看它。[这里](https://huggingface.co/spaces/m-ric/hf-model-downloads) 是我推送的 `model_downloads_tool`。它有一个漂亮的 gradio 界面。\n\n在深入工具文件时，你可以发现所有工具的逻辑都在 [tool.py](https://huggingface.co/spaces/m-ric/hf-model-downloads/blob/main/tool.py) 下。这是你可以检查其他人分享的工具的地方。\n\n然后你可以使用 [`load_tool`] 加载工具或使用 [`~Tool.from_hub`] 创建它，并将其传递给 agent 中的 `tools` 参数。\n由于运行工具意味着运行自定义代码，你需要确保你信任该仓库，因此我们需要传递 `trust_remote_code=True` 来从 Hub 加载工具。\n\n```python\nfrom smolagents import load_tool, CodeAgent\n\nmodel_download_tool = load_tool(\n    \"{your_username}/hf-model-downloads\",\n    trust_remote_code=True\n)\n```\n\n### 将 Space 导入为工具\n\n你可以使用 [`Tool.from_space`] 方法直接从 Hub 导入一个 Space 作为工具！\n\n你只需要提供 Hub 上 Space 的 id、它的名称和一个帮助你的 agent 理解工具功能的描述。在底层，这将使用 [`gradio-client`](https://pypi.org/project/gradio-client/) 库来调用 Space。\n\n例如，让我们从 Hub 导入 [FLUX.1-schnell](https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell) Space 并使用它生成一张图片。\n\n```python\nimage_generation_tool = Tool.from_space(\n    \"black-forest-labs/FLUX.1-schnell\",\n    name=\"image_generator\",\n    description=\"Generate an image from a prompt\"\n)\n\nimage_generation_tool(\"A sunny 
beach\")\n```\n瞧，这是你的图片！🏖️\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp\">\n\n然后你可以像使用任何其他工具一样使用这个工具。例如，让我们改进提示 `A rabbit wearing a space suit` 并生成它的图片。\n\n```python\nfrom smolagents import CodeAgent, InferenceClientModel\n\nmodel = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\nagent = CodeAgent(tools=[image_generation_tool], model=model)\n\nagent.run(\n    \"Improve this prompt, then generate an image of it.\", additional_args={'user_prompt': 'A rabbit wearing a space suit'}\n)\n```\n\n```text\n=== Agent thoughts:\nimproved_prompt could be \"A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background\"\n\nNow that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.\n>>> Agent is executing the code below:\nimage = image_generator(prompt=\"A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background\")\nfinal_answer(image)\n```\n\n<img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp\">\n\n这得有多酷？🤩\n\n### 使用 LangChain 工具\n\n我们喜欢 Langchain，并认为它有一套非常吸引人的工具。\n要从 LangChain 导入工具，请使用 `from_langchain()` 方法。\n\n以下是如何使用它来重现介绍中的搜索结果，使用 LangChain 的 web 搜索工具。\n这个工具需要 `pip install langchain google-search-results -q` 才能正常工作。\n```python\nfrom langchain.agents import load_tools\n\nsearch_tool = Tool.from_langchain(load_tools([\"serpapi\"])[0])\n\nagent = CodeAgent(tools=[search_tool], model=model)\n\nagent.run(\"How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?\")\n```\n\n### 管理你的 agent 工具箱\n\n你可以通过添加或替换工具来管理 agent 的工具箱。\n\n让我们将 `model_download_tool` 添加到一个仅使用默认工具箱初始化的现有 agent 中。\n\n```python\nfrom 
smolagents import InferenceClientModel\n\nmodel = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\n\nagent = CodeAgent(tools=[], model=model, add_base_tools=True)\nagent.tools[model_download_tool.name] = model_download_tool\n```\n现在我们可以利用新工具：\n\n```python\nagent.run(\n    \"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub but reverse the letters?\"\n)\n```\n\n\n> [!TIP]\n> 注意不要向 agent 添加太多工具：这可能会让较弱的 LLM 引擎不堪重负。\n\n\n### 使用工具集合\n\n你可以通过使用 ToolCollection 对象来利用工具集合，使用你想要使用的集合的 slug。\n然后将它们作为列表传递给 agent 初始化，并开始使用它们！\n\n```py\nfrom smolagents import ToolCollection, CodeAgent\n\nimage_tool_collection = ToolCollection.from_hub(\n    collection_slug=\"huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f\",\n    token=\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\"\n)\nagent = CodeAgent(tools=[*image_tool_collection.tools], model=model, add_base_tools=True)\n\nagent.run(\"Please draw me a picture of rivers and lakes.\")\n```\n\n为了加快启动速度，工具仅在 agent 调用时加载。\n"
  },
  {
    "path": "e2b.toml",
    "content": "# This is a config for E2B sandbox template.\n# You can use template ID (qywp2ctmu2q7jzprcf4j) to create a sandbox:\n\n# Python SDK\n# from e2b import Sandbox, AsyncSandbox\n# sandbox = Sandbox(\"qywp2ctmu2q7jzprcf4j\") # Sync sandbox\n# sandbox = await AsyncSandbox.create(\"qywp2ctmu2q7jzprcf4j\") # Async sandbox\n\n# JS SDK\n# import { Sandbox } from 'e2b'\n# const sandbox = await Sandbox.create('qywp2ctmu2q7jzprcf4j')\n\nteam_id = \"f8776d3a-df2f-4a1d-af48-68c2e13b3b87\"\nstart_cmd = \"/root/.jupyter/start-up.sh\"\ndockerfile = \"e2b.Dockerfile\"\ntemplate_id = \"qywp2ctmu2q7jzprcf4j\"\n"
  },
  {
    "path": "examples/agent_from_any_llm.py",
    "content": "from smolagents import (\n    CodeAgent,\n    InferenceClientModel,\n    LiteLLMModel,\n    OpenAIModel,\n    ToolCallingAgent,\n    TransformersModel,\n    tool,\n)\n\n\n# Choose which inference type to use!\n\navailable_inferences = [\"inference_client\", \"transformers\", \"ollama\", \"litellm\", \"openai\"]\nchosen_inference = \"inference_client\"\n\nprint(f\"Chose model: '{chosen_inference}'\")\n\nif chosen_inference == \"inference_client\":\n    model = InferenceClientModel(model_id=\"meta-llama/Llama-3.3-70B-Instruct\", provider=\"nebius\")\n\nelif chosen_inference == \"transformers\":\n    model = TransformersModel(model_id=\"HuggingFaceTB/SmolLM2-1.7B-Instruct\", device_map=\"auto\", max_new_tokens=1000)\n\nelif chosen_inference == \"ollama\":\n    model = LiteLLMModel(\n        model_id=\"ollama_chat/llama3.2\",\n        api_base=\"http://localhost:11434\",  # replace with remote open-ai compatible server if necessary\n        api_key=\"your-api-key\",  # replace with API key if necessary\n        num_ctx=8192,  # ollama default is 2048 which will often fail horribly. 8192 works for easy tasks, more is better. 
Check https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator to calculate how much VRAM this will need for the selected model.\n    )\n\nelif chosen_inference == \"litellm\":\n    # For anthropic: change model_id below to 'anthropic/claude-3-5-sonnet-latest'\n    model = LiteLLMModel(model_id=\"gpt-4o\")\n\nelif chosen_inference == \"openai\":\n    # For anthropic: change model_id below to 'anthropic/claude-3-5-sonnet-latest'\n    model = OpenAIModel(model_id=\"gpt-4o\")\n\n\n@tool\ndef get_weather(location: str, celsius: bool | None = False) -> str:\n    \"\"\"\n    Get the weather for the next few days at the given location.\n    Secretly this tool does not care about the location, it hates the weather everywhere.\n\n    Args:\n        location: the location\n        celsius: whether to report the temperature in Celsius\n    \"\"\"\n    return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n\nagent = ToolCallingAgent(tools=[get_weather], model=model, verbosity_level=2)\n\nprint(\"ToolCallingAgent:\", agent.run(\"What's the weather like in Paris?\"))\n\nagent = CodeAgent(tools=[get_weather], model=model, verbosity_level=2, stream_outputs=True)\n\nprint(\"CodeAgent:\", agent.run(\"What's the weather like in Paris?\"))\n"
  },
  {
    "path": "examples/async_agent/README.md",
    "content": "# Async Applications with Agents\n\nThis example demonstrates how to use a `CodeAgent` from the `smolagents` library in an asynchronous Starlette web application.\nThe agent is executed in a background thread using `anyio.to_thread.run_sync`, allowing you to integrate synchronous agent logic into an async web server.\n\n## Key Concepts\n\n- **Starlette**: A lightweight ASGI framework for building async web apps.\n- **anyio.to_thread.run_sync**: Runs blocking (sync) code in a thread, so it doesn't block the async event loop.\n- **CodeAgent**: An agent from the `smolagents` library that can be used to solve tasks programmatically.\n\n## How it works\n\n- The Starlette app exposes a `/run-agent` endpoint that accepts a JSON payload with a `task` string.\n- When a request is received, the agent is run in a background thread using `anyio.to_thread.run_sync`.\n- The result is returned as a JSON response.\n\n## Implementation Note\n\n**Why use a background thread?** \n\n`CodeAgent.run()` executes Python code synchronously, which would block Starlette's async event loop if called directly. By offloading this synchronous operation to a separate thread with `anyio.to_thread.run_sync`, we maintain the application's responsiveness while the agent processes requests, ensuring optimal performance in high-concurrency scenarios.\n\n## Usage\n\n1. **Install dependencies**:\n   ```bash\n   pip install smolagents starlette anyio uvicorn\n   ```\n\n2. **Run the app**:\n   ```bash\n   uvicorn async_codeagent_starlette.main:app --reload\n   ```\n\n3. **Test the endpoint**:\n   ```bash\n   curl -X POST http://localhost:8000/run-agent -H 'Content-Type: application/json' -d '{\"task\": \"What is 2+2?\"}'\n   ```\n\n## Files\n\n- `main.py`: Main Starlette application with async endpoint using CodeAgent.\n- `README.md`: This file.\n\n---\nThis example is designed to be clear and didactic for users new to async Python and agent integration.\n"
  },
  {
    "path": "examples/async_agent/main.py",
    "content": "\"\"\"\nAsync CodeAgent Example with Starlette\n\nThis example demonstrates how to use a CodeAgent in an async Starlette app,\nrunning the agent in a background thread using anyio.to_thread.run_sync.\n\"\"\"\n\nimport anyio.to_thread\nfrom starlette.applications import Starlette\nfrom starlette.requests import Request\nfrom starlette.responses import JSONResponse\nfrom starlette.routing import Route\n\nfrom smolagents import CodeAgent, InferenceClientModel\n\n\n# Create a simple agent instance (customize as needed)\ndef get_agent():\n    # You can set custom model, or tools as needed\n    return CodeAgent(\n        model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n        tools=[],\n    )\n\n\nasync def run_agent_in_thread(task: str):\n    agent = get_agent()\n    # The agent's run method is synchronous\n    result = await anyio.to_thread.run_sync(agent.run, task)\n    return result\n\n\nasync def run_agent_endpoint(request: Request):\n    data = await request.json()\n    task = data.get(\"task\")\n    if not task:\n        return JSONResponse({\"error\": 'Missing \"task\" in request body.'}, status_code=400)\n    try:\n        result = await run_agent_in_thread(task)\n        return JSONResponse({\"result\": result})\n    except Exception as e:\n        return JSONResponse({\"error\": str(e)}, status_code=500)\n\n\nroutes = [\n    Route(\"/run-agent\", run_agent_endpoint, methods=[\"POST\"]),\n]\n\napp = Starlette(debug=True, routes=routes)\n"
  },
  {
    "path": "examples/async_agent/requirements.txt",
    "content": "smolagents\nstarlette\nanyio\nuvicorn\n"
  },
  {
    "path": "examples/gradio_ui.py",
    "content": "from smolagents import CodeAgent, GradioUI, InferenceClientModel, WebSearchTool\n\n\nagent = CodeAgent(\n    tools=[WebSearchTool()],\n    model=InferenceClientModel(model_id=\"meta-llama/Llama-3.3-70B-Instruct\", provider=\"fireworks-ai\"),\n    verbosity_level=1,\n    planning_interval=3,\n    name=\"example_agent\",\n    description=\"This is an example agent.\",\n    step_callbacks=[],\n    stream_outputs=True,\n    # use_structured_outputs_internally=True,\n)\n\nGradioUI(agent, file_upload_folder=\"./data\").launch()\n"
  },
  {
    "path": "examples/inspect_multiagent_run.py",
    "content": "from openinference.instrumentation.smolagents import SmolagentsInstrumentor\nfrom phoenix.otel import register\n\n\nregister()\nSmolagentsInstrumentor().instrument(skip_dep_check=True)\n\n\nfrom smolagents import (\n    CodeAgent,\n    InferenceClientModel,\n    ToolCallingAgent,\n    VisitWebpageTool,\n    WebSearchTool,\n)\n\n\n# Then we run the agentic part!\nmodel = InferenceClientModel(provider=\"nebius\")\n\nsearch_agent = ToolCallingAgent(\n    tools=[WebSearchTool(), VisitWebpageTool()],\n    model=model,\n    name=\"search_agent\",\n    description=\"This is an agent that can do web search.\",\n    return_full_result=True,\n)\n\nmanager_agent = CodeAgent(\n    tools=[],\n    model=model,\n    managed_agents=[search_agent],\n    return_full_result=True,\n)\nrun_result = manager_agent.run(\n    \"If the US keeps it 2024 growth rate, how many years would it take for the GDP to double?\"\n)\nprint(\"Here is the token usage for the manager agent\", run_result.token_usage)\nprint(\"Here are the timing informations for the manager agent:\", run_result.timing)\n"
  },
  {
    "path": "examples/multi_llm_agent.py",
    "content": "import os\n\nfrom smolagents import CodeAgent, LiteLLMRouterModel, WebSearchTool\n\n\n# Make sure to setup the necessary environment variables!\n\nllm_loadbalancer_model_list = [\n    {\n        \"model_name\": \"model-group-1\",\n        \"litellm_params\": {\n            \"model\": \"gpt-4o-mini\",\n            \"api_key\": os.getenv(\"OPENAI_API_KEY\"),\n        },\n    },\n    {\n        \"model_name\": \"model-group-1\",\n        \"litellm_params\": {\n            \"model\": \"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\",\n            \"aws_access_key_id\": os.getenv(\"AWS_ACCESS_KEY_ID\"),\n            \"aws_secret_access_key\": os.getenv(\"AWS_SECRET_ACCESS_KEY\"),\n            \"aws_region_name\": os.getenv(\"AWS_REGION\"),\n        },\n    },\n    # {\n    #     \"model_name\": \"model-group-2\",\n    #     \"litellm_params\": {\n    #         \"model\": \"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\",\n    #         \"aws_access_key_id\": os.getenv(\"AWS_ACCESS_KEY_ID\"),\n    #         \"aws_secret_access_key\": os.getenv(\"AWS_SECRET_ACCESS_KEY\"),\n    #         \"aws_region_name\": os.getenv(\"AWS_REGION\"),\n    #     },\n    # },\n]\n\n\nmodel = LiteLLMRouterModel(\n    model_id=\"model-group-1\",\n    model_list=llm_loadbalancer_model_list,\n    client_kwargs={\"routing_strategy\": \"simple-shuffle\"},\n)\nagent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True, return_full_result=True)\n\nfull_result = agent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\n\nprint(full_result)\n"
  },
  {
    "path": "examples/multiple_tools.py",
    "content": "import requests\n\n# from smolagents.agents import ToolCallingAgent\nfrom smolagents import CodeAgent, InferenceClientModel, tool\n\n\n# Choose which LLM engine to use!\nmodel = InferenceClientModel()\n# model = TransformersModel(model_id=\"meta-llama/Llama-3.2-2B-Instruct\")\n\n# For anthropic: change model_id below to 'anthropic/claude-3-5-sonnet-20240620'\n# model = LiteLLMModel(model_id=\"gpt-5\")\n\n\n@tool\ndef get_weather(location: str, celsius: bool | None = False) -> str:\n    \"\"\"\n    Get the current weather at the given location using the WeatherStack API.\n\n    Args:\n        location: The location (city name).\n        celsius: Whether to return the temperature in Celsius (default is False, which returns Fahrenheit).\n\n    Returns:\n        A string describing the current weather at the location.\n    \"\"\"\n    api_key = \"your_api_key\"  # Replace with your API key from https://weatherstack.com/\n    units = \"m\" if celsius else \"f\"  # 'm' for Celsius, 'f' for Fahrenheit\n\n    url = f\"http://api.weatherstack.com/current?access_key={api_key}&query={location}&units={units}\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()  # Raise an exception for HTTP errors\n\n        data = response.json()\n\n        if data.get(\"error\"):  # Check if there's an error in the response\n            return f\"Error: {data['error'].get('info', 'Unable to fetch weather data.')}\"\n\n        weather = data[\"current\"][\"weather_descriptions\"][0]\n        temp = data[\"current\"][\"temperature\"]\n        temp_unit = \"°C\" if celsius else \"°F\"\n\n        return f\"The current weather in {location} is {weather} with a temperature of {temp} {temp_unit}.\"\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching weather data: {str(e)}\"\n\n\n@tool\ndef convert_currency(amount: float, from_currency: str, to_currency: str) -> str:\n    \"\"\"\n    Converts a specified 
amount from one currency to another using the ExchangeRate-API.\n\n    Args:\n        amount: The amount of money to convert.\n        from_currency: The currency code of the currency to convert from (e.g., 'USD').\n        to_currency: The currency code of the currency to convert to (e.g., 'EUR').\n\n    Returns:\n        str: A string describing the converted amount in the target currency, or an error message if the conversion fails.\n\n    Raises:\n        requests.exceptions.RequestException: If there is an issue with the HTTP request to the ExchangeRate-API.\n    \"\"\"\n    api_key = \"your_api_key\"  # Replace with your actual API key from https://www.exchangerate-api.com/\n    url = f\"https://v6.exchangerate-api.com/v6/{api_key}/latest/{from_currency}\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()\n\n        data = response.json()\n        exchange_rate = data[\"conversion_rates\"].get(to_currency)\n\n        if not exchange_rate:\n            return f\"Error: Unable to find exchange rate for {from_currency} to {to_currency}.\"\n\n        converted_amount = amount * exchange_rate\n        return f\"{amount} {from_currency} is equal to {converted_amount} {to_currency}.\"\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching conversion data: {str(e)}\"\n\n\n@tool\ndef get_news_headlines() -> str:\n    \"\"\"\n    Fetches the top news headlines from the News API for the United States.\n    This function makes a GET request to the News API to retrieve the top news headlines\n    for the United States. It returns the titles and sources of the top 5 articles as a\n    formatted string. If no articles are available, it returns a message indicating that\n    no news is available. 
In case of a request error, it returns an error message.\n    Returns:\n        str: A string containing the top 5 news headlines and their sources, or an error message.\n    \"\"\"\n    api_key = \"your_api_key\"  # Replace with your actual API key from https://newsapi.org/\n    url = f\"https://newsapi.org/v2/top-headlines?country=us&apiKey={api_key}\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()\n\n        data = response.json()\n        articles = data[\"articles\"]\n\n        if not articles:\n            return \"No news available at the moment.\"\n\n        headlines = [f\"{article['title']} - {article['source']['name']}\" for article in articles[:5]]\n        return \"\\n\".join(headlines)\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching news data: {str(e)}\"\n\n\n@tool\ndef get_joke() -> str:\n    \"\"\"\n    Fetches a random joke from the JokeAPI.\n    This function sends a GET request to the JokeAPI to retrieve a random joke.\n    It handles both single jokes and two-part jokes (setup and delivery).\n    If the request fails or the response does not contain a joke, an error message is returned.\n    Returns:\n        str: The joke as a string, or an error message if the joke could not be fetched.\n    \"\"\"\n    url = \"https://v2.jokeapi.dev/joke/Any?type=single\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()\n\n        data = response.json()\n\n        if \"joke\" in data:\n            return data[\"joke\"]\n        elif \"setup\" in data and \"delivery\" in data:\n            return f\"{data['setup']} - {data['delivery']}\"\n        else:\n            return \"Error: Unable to fetch joke.\"\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching joke: {str(e)}\"\n\n\n@tool\ndef get_time_in_timezone(location: str) -> str:\n    \"\"\"\n    Fetches the current time for a given location using the 
World Time API.\n    Args:\n        location: The location for which to fetch the current time, formatted as 'Region/City'.\n    Returns:\n        str: A string indicating the current time in the specified location, or an error message if the request fails.\n    Raises:\n        requests.exceptions.RequestException: If there is an issue with the HTTP request.\n    \"\"\"\n    url = f\"http://worldtimeapi.org/api/timezone/{location}.json\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()\n\n        data = response.json()\n        current_time = data[\"datetime\"]\n\n        return f\"The current time in {location} is {current_time}.\"\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching time data: {str(e)}\"\n\n\n@tool\ndef get_random_fact() -> str:\n    \"\"\"\n    Fetches a random fact from the \"uselessfacts.jsph.pl\" API.\n    Returns:\n        str: A string containing the random fact or an error message if the request fails.\n    \"\"\"\n    url = \"https://uselessfacts.jsph.pl/random.json?language=en\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()\n\n        data = response.json()\n\n        return f\"Random Fact: {data['text']}\"\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching random fact: {str(e)}\"\n\n\n@tool\ndef search_wikipedia(query: str) -> str:\n    \"\"\"\n    Fetches a summary of a Wikipedia page for a given query.\n    Args:\n        query: The search term to look up on Wikipedia.\n    Returns:\n        str: A summary of the Wikipedia page if successful, or an error message if the request fails.\n    Raises:\n        requests.exceptions.RequestException: If there is an issue with the HTTP request.\n    \"\"\"\n    url = f\"https://en.wikipedia.org/api/rest_v1/page/summary/{query}\"\n\n    try:\n        response = requests.get(url)\n        response.raise_for_status()\n\n        data = 
response.json()\n        title = data[\"title\"]\n        extract = data[\"extract\"]\n\n        return f\"Summary for {title}: {extract}\"\n\n    except requests.exceptions.RequestException as e:\n        return f\"Error fetching Wikipedia data: {str(e)}\"\n\n\n# If you want to use the ToolCallingAgent instead, uncomment the following lines as they both will work\n\n# agent = ToolCallingAgent(\n#     tools=[\n#         convert_currency,\n#         get_weather,\n#         get_news_headlines,\n#         get_joke,\n#         get_random_fact,\n#         search_wikipedia,\n#     ],\n#     model=model,\n# )\n\n\nagent = CodeAgent(\n    tools=[\n        convert_currency,\n        get_weather,\n        get_news_headlines,\n        get_joke,\n        get_random_fact,\n        search_wikipedia,\n    ],\n    model=model,\n    stream_outputs=True,\n)\n\n# Uncomment the line below to run the agent with a specific query\n\nagent.run(\"Convert 5000 dollars to Euros\")\n# agent.run(\"What is the weather in New York?\")\n# agent.run(\"Give me the top news headlines\")\n# agent.run(\"Tell me a joke\")\n# agent.run(\"Tell me a Random Fact\")\n# agent.run(\"who is Elon Musk?\")\n"
  },
  {
    "path": "examples/open_deep_research/README.md",
    "content": "# Open Deep Research\n\nWelcome to this open replication of [OpenAI's Deep Research](https://openai.com/index/introducing-deep-research/)! This agent attempts to replicate OpenAI's model and achieve similar performance on research tasks.\n\nRead more about this implementation's goal and methods in our [blog post](https://huggingface.co/blog/open-deep-research).\n\n\nThis agent achieves **55% pass@1** on the GAIA validation set, compared to **67%** for the original Deep Research.\n\n## Setup\n\nTo get started, follow the steps below:\n\n### Clone the repository\n\n```bash\ngit clone https://github.com/huggingface/smolagents.git\ncd smolagents/examples/open_deep_research\n```\n\n### Install dependencies\n\nRun the following command to install the required dependencies from the `requirements.txt` file:\n\n```bash\npip install -r requirements.txt\n```\n\n### Install the development version of `smolagents`\n\n```bash\npip install -e ../../.[dev]\n```\n\n### Set up environment variables\n\nThe agent uses the `GoogleSearchTool` for web search, which requires an environment variable with the corresponding API key, based on the selected provider:\n- `SERPAPI_API_KEY` for SerpApi: [Sign up here to get a key](https://serpapi.com/users/sign_up)\n- `SERPER_API_KEY` for Serper: [Sign up here to get a key](https://serper.dev/signup)\n\nDepending on the model you want to use, you may need to set environment variables.\nFor example, to use the default `o1` model, you need to set the `OPENAI_API_KEY` environment variable.\n[Sign up here to get a key](https://platform.openai.com/signup).\n\n> [!WARNING]\n> The use of the default `o1` model is restricted to tier-3 access: https://help.openai.com/en/articles/10362446-api-access-to-o1-and-o3-mini\n\n\n## Usage\n\nThen you're good to go! 
Run the `run.py` script, for example:\n```bash\npython run.py --model-id \"o1\" \"Your question here!\"\n```\n\n## Full reproducibility of results\n\nThe data used in our submissions to GAIA was augmented in this way:\n- Each single-page .pdf or .xls file was opened in a file reader (macOS Sonoma Numbers or Preview), and a \".png\" screenshot was taken and added to the folder.\n- Then, for any file used in a question, the file loading system checks whether a \".png\" version of the file exists, and loads it instead of the original if it does.\n\nThis process was done manually but could be automated.\n\nAfter processing, the annotated data was uploaded to a [new dataset](https://huggingface.co/datasets/smolagents/GAIA-annotated). You need to request access (granted instantly)."
  },
  {
    "path": "examples/open_deep_research/analysis.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"!pip install plotly kaleido datasets nbformat -U -q\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import os\\n\",\n    \"\\n\",\n    \"import datasets\\n\",\n    \"import pandas as pd\\n\",\n    \"from dotenv import load_dotenv\\n\",\n    \"from huggingface_hub import login\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"load_dotenv(override=True)\\n\",\n    \"login(os.getenv(\\\"HF_TOKEN\\\"))\\n\",\n    \"\\n\",\n    \"pd.set_option(\\\"max_colwidth\\\", None)\\n\",\n    \"\\n\",\n    \"OUTPUT_DIR = \\\"output\\\"\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"eval_ds = datasets.load_dataset(\\\"gaia-benchmark/GAIA\\\", \\\"2023_all\\\")[\\\"validation\\\"]\\n\",\n    \"eval_ds = eval_ds.rename_columns({\\\"Question\\\": \\\"question\\\", \\\"Final answer\\\": \\\"true_answer\\\", \\\"Level\\\": \\\"task\\\"})\\n\",\n    \"eval_df = pd.DataFrame(eval_ds)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# 1. 
Load all results\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 88,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import glob\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"results = []\\n\",\n    \"for f in glob.glob(f\\\"{OUTPUT_DIR}/validation/*.jsonl\\\"):\\n\",\n    \"    df = pd.read_json(f, lines=True)\\n\",\n    \"    df[\\\"agent_name\\\"] = f.split(\\\"/\\\")[-1].split(\\\".\\\")[0]\\n\",\n    \"    results.append(df)\\n\",\n    \"\\n\",\n    \"result_df = pd.concat(results)\\n\",\n    \"result_df[\\\"prediction\\\"] = result_df[\\\"prediction\\\"].fillna(\\\"No prediction\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import re\\n\",\n    \"from collections import Counter\\n\",\n    \"\\n\",\n    \"from scripts.gaia_scorer import check_close_call, question_scorer\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"result_df[\\\"is_correct\\\"] = result_df.apply(lambda x: question_scorer(x[\\\"prediction\\\"], x[\\\"true_answer\\\"]), axis=1)\\n\",\n    \"result_df[\\\"is_near_correct\\\"] = result_df.apply(\\n\",\n    \"    lambda x: check_close_call(x[\\\"prediction\\\"], x[\\\"true_answer\\\"], x[\\\"is_correct\\\"]),\\n\",\n    \"    axis=1,\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"result_df[\\\"count_steps\\\"] = result_df[\\\"intermediate_steps\\\"].apply(len)\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def find_attachment(question):\\n\",\n    \"    matches = eval_df.loc[eval_df[\\\"question\\\"].apply(lambda x: x in question), \\\"file_name\\\"]\\n\",\n    \"\\n\",\n    \"    if len(matches) == 0:\\n\",\n    \"        return \\\"Not found\\\"\\n\",\n    \"    file_path = matches.values[0]\\n\",\n    \"\\n\",\n    \"    if isinstance(file_path, str) and len(file_path) > 0:\\n\",\n    \"        return file_path.split(\\\".\\\")[-1]\\n\",\n    \"    else:\\n\",\n    \"        return \\\"None\\\"\\n\",\n    \"\\n\",\n   
 \"\\n\",\n    \"result_df[\\\"attachment_type\\\"] = result_df[\\\"question\\\"].apply(find_attachment)\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def extract_tool_calls(code):\\n\",\n    \"    regex = r\\\"\\\\b(\\\\w+)\\\\(\\\"\\n\",\n    \"    function_calls = [el for el in re.findall(regex, code) if el.islower()]\\n\",\n    \"\\n\",\n    \"    function_call_counter = Counter(function_calls)\\n\",\n    \"    return function_call_counter\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def sum_tool_calls(steps):\\n\",\n    \"    total_count = Counter()\\n\",\n    \"    for step in steps:\\n\",\n    \"        if \\\"llm_output\\\" in step:\\n\",\n    \"            total_count += extract_tool_calls(step[\\\"llm_output\\\"])\\n\",\n    \"\\n\",\n    \"    return total_count\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def get_durations(row):\\n\",\n    \"    # start_datetime = datetime.strptime(row['start_time'], \\\"%Y-%m-%d %H:%M:%S\\\")\\n\",\n    \"    # end_datetime = datetime.strptime(row['end_time'], \\\"%Y-%m-%d %H:%M:%S\\\")\\n\",\n    \"\\n\",\n    \"    duration_timedelta = row[\\\"end_time\\\"] - row[\\\"start_time\\\"]\\n\",\n    \"    return int(duration_timedelta.total_seconds())\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"result_df[\\\"duration\\\"] = result_df.apply(get_durations, axis=1)\\n\",\n    \"# result_df[\\\"tool_calls\\\"] = result_df[\\\"intermediate_steps\\\"].apply(sum_tool_calls)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"result_df[\\\"agent_name\\\"].value_counts()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# 2. 
Inspect specific runs\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"sel_df = result_df\\n\",\n    \"# sel_df = sel_df.loc[\\n\",\n    \"#     (result_df[\\\"agent_name\\\"].isin(list_versions))\\n\",\n    \"# ]\\n\",\n    \"sel_df = sel_df.reset_index(drop=True)\\n\",\n    \"display(sel_df[\\\"agent_name\\\"].value_counts())\\n\",\n    \"sel_df = sel_df.drop_duplicates(subset=[\\\"agent_name\\\", \\\"question\\\"])\\n\",\n    \"display(sel_df.groupby(\\\"agent_name\\\")[[\\\"task\\\"]].value_counts())\\n\",\n    \"print(\\\"Total length:\\\", len(sel_df), \\\"- is complete:\\\", len(sel_df) == 165)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"display(\\\"Average score:\\\", sel_df.groupby(\\\"agent_name\\\")[[\\\"is_correct\\\"]].mean().round(3))\\n\",\n    \"display(\\n\",\n    \"    sel_df.groupby([\\\"agent_name\\\", \\\"task\\\"])[[\\\"is_correct\\\", \\\"is_near_correct\\\", \\\"count_steps\\\", \\\"question\\\", \\\"duration\\\"]]\\n\",\n    \"    .agg(\\n\",\n    \"        {\\n\",\n    \"            \\\"is_correct\\\": \\\"mean\\\",\\n\",\n    \"            \\\"is_near_correct\\\": \\\"mean\\\",\\n\",\n    \"            \\\"count_steps\\\": \\\"mean\\\",\\n\",\n    \"            \\\"question\\\": \\\"count\\\",\\n\",\n    \"            \\\"duration\\\": \\\"mean\\\",\\n\",\n    \"        }\\n\",\n    \"    )\\n\",\n    \"    .rename(columns={\\\"question\\\": \\\"count\\\"})\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import plotly.express as px\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"cumulative_df = (\\n\",\n    \"    (\\n\",\n    \"        sel_df.groupby(\\\"agent_name\\\")[[\\\"is_correct\\\", \\\"is_near_correct\\\"]]\\n\",\n    \"        
.expanding(min_periods=1, axis=0, method=\\\"single\\\")\\n\",\n    \"        .agg({\\\"is_correct\\\": \\\"mean\\\", \\\"is_near_correct\\\": \\\"count\\\"})\\n\",\n    \"        .reset_index()\\n\",\n    \"    )\\n\",\n    \"    .copy()\\n\",\n    \"    .rename(columns={\\\"is_near_correct\\\": \\\"index\\\"})\\n\",\n    \")\\n\",\n    \"cumulative_df[\\\"index\\\"] = cumulative_df[\\\"index\\\"].astype(int) - 1\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def find_question(row):\\n\",\n    \"    try:\\n\",\n    \"        res = sel_df.loc[sel_df[\\\"agent_name\\\"] == row[\\\"agent_name\\\"], \\\"question\\\"].iloc[row[\\\"index\\\"]][:50]\\n\",\n    \"        return res\\n\",\n    \"    except Exception:\\n\",\n    \"        return \\\"\\\"\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"cumulative_df[\\\"question\\\"] = cumulative_df.apply(find_question, axis=1)\\n\",\n    \"\\n\",\n    \"px.line(\\n\",\n    \"    cumulative_df,\\n\",\n    \"    color=\\\"agent_name\\\",\\n\",\n    \"    x=\\\"index\\\",\\n\",\n    \"    y=\\\"is_correct\\\",\\n\",\n    \"    hover_data=\\\"question\\\",\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# 3. 
Dive deeper into one run\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"sel_df = result_df.loc[result_df[\\\"agent_name\\\"] == \\\"o1\\\"]\\n\",\n    \"print(len(sel_df))\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Count errors\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import numpy as np\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"error_types = [\\n\",\n    \"    \\\"AgentParsingError\\\",\\n\",\n    \"    \\\"AgentExecutionError\\\",\\n\",\n    \"    \\\"AgentMaxIterationsError\\\",\\n\",\n    \"    \\\"AgentGenerationError\\\",\\n\",\n    \"]\\n\",\n    \"sel_df[error_types] = 0\\n\",\n    \"sel_df[\\\"Count steps\\\"] = np.nan\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def count_errors(row):\\n\",\n    \"    if isinstance(row[\\\"intermediate_steps\\\"], list):\\n\",\n    \"        row[\\\"Count steps\\\"] = len(row[\\\"intermediate_steps\\\"])\\n\",\n    \"        for step in row[\\\"intermediate_steps\\\"]:\\n\",\n    \"            if isinstance(step, dict) and \\\"error\\\" in step:\\n\",\n    \"                try:\\n\",\n    \"                    row[str(step[\\\"error\\\"][\\\"error_type\\\"])] += 1\\n\",\n    \"                except Exception:\\n\",\n    \"                    pass\\n\",\n    \"    return row\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"sel_df = sel_df.apply(count_errors, axis=1)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import plotly.express as px\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"aggregate_errors = (\\n\",\n    \"    sel_df.groupby([\\\"is_correct\\\"])[error_types + [\\\"Count steps\\\"]].mean().reset_index().melt(id_vars=[\\\"is_correct\\\"])\\n\",\n    \")\\n\",\n    
\"\\n\",\n    \"fig = px.bar(\\n\",\n    \"    aggregate_errors,\\n\",\n    \"    y=\\\"value\\\",\\n\",\n    \"    x=\\\"variable\\\",\\n\",\n    \"    color=\\\"is_correct\\\",\\n\",\n    \"    labels={\\n\",\n    \"        \\\"agent_name\\\": \\\"<b>Model</b>\\\",\\n\",\n    \"        \\\"task\\\": \\\"<b>Level</b>\\\",\\n\",\n    \"        \\\"aggregate_score\\\": \\\"<b>Performance</b>\\\",\\n\",\n    \"        \\\"value\\\": \\\"<b>Average count</b>\\\",\\n\",\n    \"        \\\"eval_score_GPT4\\\": \\\"<b>Score</b>\\\",\\n\",\n    \"    },\\n\",\n    \")\\n\",\n    \"fig.update_layout(\\n\",\n    \"    height=500,\\n\",\n    \"    width=800,\\n\",\n    \"    barmode=\\\"group\\\",\\n\",\n    \"    bargroupgap=0.0,\\n\",\n    \")\\n\",\n    \"fig.update_traces(textposition=\\\"outside\\\")\\n\",\n    \"fig.write_image(\\\"aggregate_errors.png\\\", scale=3)\\n\",\n    \"fig.show()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Inspect result by file extension type\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"display(\\n\",\n    \"    result_df.groupby([\\\"attachment_type\\\"])[[\\\"is_correct\\\", \\\"count_steps\\\", \\\"question\\\"]].agg(\\n\",\n    \"        {\\\"is_correct\\\": \\\"mean\\\", \\\"count_steps\\\": \\\"mean\\\", \\\"question\\\": \\\"count\\\"}\\n\",\n    \"    )\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# 4. 
Ensembling methods\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"counts = result_df[\\\"agent_name\\\"].value_counts()\\n\",\n    \"long_series = result_df.loc[result_df[\\\"agent_name\\\"].isin(counts[counts > 140].index)]\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"def majority_vote(df):\\n\",\n    \"    df = df[(df[\\\"prediction\\\"] != \\\"Unable to determine\\\") & (~df[\\\"prediction\\\"].isna()) & (df[\\\"prediction\\\"] != \\\"None\\\")]\\n\",\n    \"\\n\",\n    \"    answer_modes = df.groupby(\\\"question\\\")[\\\"prediction\\\"].agg(lambda x: x.mode()[0]).reset_index()\\n\",\n    \"    first_occurrences = (\\n\",\n    \"        df.groupby([\\\"question\\\", \\\"prediction\\\"]).agg({\\\"task\\\": \\\"first\\\", \\\"is_correct\\\": \\\"first\\\"}).reset_index()\\n\",\n    \"    )\\n\",\n    \"    result = answer_modes.merge(first_occurrences, on=[\\\"question\\\", \\\"prediction\\\"], how=\\\"left\\\")\\n\",\n    \"\\n\",\n    \"    return result\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def oracle(df):\\n\",\n    \"    def get_first_correct_or_first_wrong(group):\\n\",\n    \"        correct_answers = group[group[\\\"is_correct\\\"]]\\n\",\n    \"        if len(correct_answers) > 0:\\n\",\n    \"            return correct_answers.iloc[0]\\n\",\n    \"        return group.iloc[0]\\n\",\n    \"\\n\",\n    \"    result = df.groupby(\\\"question\\\").apply(get_first_correct_or_first_wrong)\\n\",\n    \"\\n\",\n    \"    return result.reset_index(drop=True)\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"display((long_series.groupby(\\\"agent_name\\\")[\\\"is_correct\\\"].mean() * 100).round(2))\\n\",\n    \"print(f\\\"Majority score: {majority_vote(long_series)['is_correct'].mean() * 100:.2f}\\\")\\n\",\n    \"print(f\\\"Oracle score: 
{oracle(long_series)['is_correct'].mean() * 100:.2f}\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Submit\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"agent_run = \\\"code_o1_04_february_submission5.jsonl\\\"\\n\",\n    \"df = pd.read_json(f\\\"output/validation/{agent_run}\\\", lines=True)\\n\",\n    \"df = df[[\\\"task_id\\\", \\\"prediction\\\", \\\"intermediate_steps\\\"]]\\n\",\n    \"df = df.rename(columns={\\\"prediction\\\": \\\"model_answer\\\", \\\"intermediate_steps\\\": \\\"reasoning_trace\\\"})\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"df.to_json(\\\"submission.jsonl\\\", orient=\\\"records\\\", lines=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": []\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"agents\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.12.0\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 2\n}\n"
  },
  {
    "path": "examples/open_deep_research/app.py",
    "content": "from run import create_agent\n\nfrom smolagents.gradio_ui import GradioUI\n\n\nagent = create_agent()\n\ndemo = GradioUI(agent)\n\nif __name__ == \"__main__\":\n    demo.launch()\n"
  },
  {
    "path": "examples/open_deep_research/requirements.txt",
    "content": "anthropic>=0.37.1\naudioop-lts<1.0; python_version >= \"3.13\" # required to use pydub in Python >=3.13; LTS port of the removed Python builtin module audioop\nbeautifulsoup4>=4.12.3\ndatasets>=2.21.0\ngoogle_search_results>=2.4.2\nhuggingface_hub>=0.23.4\nmammoth>=1.8.0\nmarkdownify>=0.13.1\nnumexpr>=2.10.1\nnumpy>=2.1.2\nopenai>=1.52.2\nopenpyxl\npandas>=2.2.3\npathvalidate>=3.2.1\npdfminer>=20191125\npdfminer.six>=20240706\nPillow>=11.0.0\npuremagic>=1.28\npypdf>=5.1.0\npython-dotenv>=1.0.1\npython_pptx>=1.0.2\nRequests>=2.32.3\ntqdm>=4.66.4\ntorch>=2.2.2\ntorchvision>=0.17.2\ntransformers>=4.46.0\nyoutube_transcript_api>=0.6.2\nchess\nsympy\npubchempy\nBio\nscikit-learn\nscipy\npydub\nPyPDF2\npython-pptx\ntorch\nxlrd\nSpeechRecognition\n"
  },
  {
    "path": "examples/open_deep_research/run.py",
    "content": "import argparse\nimport os\nimport threading\n\nfrom dotenv import load_dotenv\nfrom huggingface_hub import login\nfrom scripts.text_inspector_tool import TextInspectorTool\nfrom scripts.text_web_browser import (\n    ArchiveSearchTool,\n    FinderTool,\n    FindNextTool,\n    PageDownTool,\n    PageUpTool,\n    SimpleTextBrowser,\n    VisitTool,\n)\nfrom scripts.visual_qa import visualizer\n\nfrom smolagents import (\n    CodeAgent,\n    GoogleSearchTool,\n    # InferenceClientModel,\n    LiteLLMModel,\n    ToolCallingAgent,\n)\n\n\nload_dotenv(override=True)\nlogin(os.getenv(\"HF_TOKEN\"))\n\nappend_answer_lock = threading.Lock()\n\n\ndef parse_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"question\", type=str, help=\"for example: 'How many studio albums did Mercedes Sosa release before 2007?'\"\n    )\n    parser.add_argument(\"--model-id\", type=str, default=\"o1\")\n    return parser.parse_args()\n\n\ncustom_role_conversions = {\"tool-call\": \"assistant\", \"tool-response\": \"user\"}\n\nuser_agent = \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0\"\n\nBROWSER_CONFIG = {\n    \"viewport_size\": 1024 * 5,\n    \"downloads_folder\": \"downloads_folder\",\n    \"request_kwargs\": {\n        \"headers\": {\"User-Agent\": user_agent},\n        \"timeout\": 300,\n    },\n    \"serpapi_key\": os.getenv(\"SERPAPI_API_KEY\"),\n}\n\nos.makedirs(f\"./{BROWSER_CONFIG['downloads_folder']}\", exist_ok=True)\n\n\ndef create_agent(model_id=\"o1\"):\n    model_params = {\n        \"model_id\": model_id,\n        \"custom_role_conversions\": custom_role_conversions,\n        \"max_completion_tokens\": 8192,\n    }\n    if model_id == \"o1\":\n        model_params[\"reasoning_effort\"] = \"high\"\n    model = LiteLLMModel(**model_params)\n\n    text_limit = 100000\n    browser = SimpleTextBrowser(**BROWSER_CONFIG)\n    WEB_TOOLS = [\n        
GoogleSearchTool(provider=\"serper\"),\n        VisitTool(browser),\n        PageUpTool(browser),\n        PageDownTool(browser),\n        FinderTool(browser),\n        FindNextTool(browser),\n        ArchiveSearchTool(browser),\n        TextInspectorTool(model, text_limit),\n    ]\n    text_webbrowser_agent = ToolCallingAgent(\n        model=model,\n        tools=WEB_TOOLS,\n        max_steps=20,\n        verbosity_level=2,\n        planning_interval=4,\n        name=\"search_agent\",\n        description=\"\"\"A team member that will search the internet to answer your question.\n    Ask him for all your questions that require browsing the web.\n    Provide him as much context as possible, in particular if you need to search on a specific timeframe!\n    And don't hesitate to provide him with a complex search task, like finding a difference between two webpages.\n    Your request must be a real sentence, not a google search! Like \"Find me this information (...)\" rather than a few keywords.\n    \"\"\",\n        provide_run_summary=True,\n    )\n    text_webbrowser_agent.prompt_templates[\"managed_agent\"][\"task\"] += \"\"\"You can navigate to .txt online files.\n    If a non-html page is in another format, especially .pdf or a Youtube video, use tool 'inspect_file_as_text' to inspect it.\n    Additionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information.\"\"\"\n\n    manager_agent = CodeAgent(\n        model=model,\n        tools=[visualizer, TextInspectorTool(model, text_limit)],\n        max_steps=12,\n        verbosity_level=2,\n        additional_authorized_imports=[\"*\"],\n        planning_interval=4,\n        managed_agents=[text_webbrowser_agent],\n    )\n\n    return manager_agent\n\n\ndef main():\n    args = parse_args()\n\n    agent = create_agent(model_id=args.model_id)\n\n    answer = 
agent.run(args.question)\n\n    print(f\"Got this answer: {answer}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "examples/open_deep_research/run_gaia.py",
    "content": "# EXAMPLE COMMAND: from folder examples/open_deep_research, run: python run_gaia.py --concurrency 32 --run-name generate-traces-03-apr-noplanning --model-id gpt-4o\nimport argparse\nimport json\nimport os\nimport threading\nfrom concurrent.futures import ThreadPoolExecutor, as_completed\nfrom datetime import datetime\nfrom pathlib import Path\nfrom typing import Any\n\nimport datasets\nimport pandas as pd\nfrom dotenv import load_dotenv\nfrom huggingface_hub import login, snapshot_download\nfrom scripts.reformulator import prepare_response\nfrom scripts.run_agents import (\n    get_single_file_description,\n    get_zip_description,\n)\nfrom scripts.text_inspector_tool import TextInspectorTool\nfrom scripts.text_web_browser import (\n    ArchiveSearchTool,\n    FinderTool,\n    FindNextTool,\n    PageDownTool,\n    PageUpTool,\n    SimpleTextBrowser,\n    VisitTool,\n)\nfrom scripts.visual_qa import visualizer\nfrom tqdm import tqdm\n\nfrom smolagents import (\n    CodeAgent,\n    GoogleSearchTool,\n    LiteLLMModel,\n    Model,\n    TokenUsage,\n    ToolCallingAgent,\n)\n\n\nload_dotenv(override=True)\nlogin(os.getenv(\"HF_TOKEN\"))\n\nappend_answer_lock = threading.Lock()\n\n\ndef parse_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--concurrency\", type=int, default=8)\n    parser.add_argument(\"--model-id\", type=str, default=\"o1\")\n    parser.add_argument(\"--run-name\", type=str, required=True)\n    parser.add_argument(\"--set-to-run\", type=str, default=\"validation\")\n    parser.add_argument(\"--use-open-models\", type=bool, default=False)\n    parser.add_argument(\"--use-raw-dataset\", action=\"store_true\")\n    return parser.parse_args()\n\n\n### IMPORTANT: EVALUATION SWITCHES\n\nprint(\"Make sure you deactivated any VPN like Tailscale, else some URLs will be blocked!\")\n\ncustom_role_conversions = {\"tool-call\": \"assistant\", \"tool-response\": \"user\"}\n\n\nuser_agent = \"Mozilla/5.0 (Windows NT 10.0; 
Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0\"\n\nBROWSER_CONFIG = {\n    \"viewport_size\": 1024 * 5,\n    \"downloads_folder\": \"downloads_folder\",\n    \"request_kwargs\": {\n        \"headers\": {\"User-Agent\": user_agent},\n        \"timeout\": 300,\n    },\n    \"serpapi_key\": os.getenv(\"SERPAPI_API_KEY\"),\n}\n\nos.makedirs(f\"./{BROWSER_CONFIG['downloads_folder']}\", exist_ok=True)\n\n\ndef create_agent_team(model: Model, token_counts: TokenUsage):\n    text_limit = 100000\n    ti_tool = TextInspectorTool(model, text_limit)\n\n    browser = SimpleTextBrowser(**BROWSER_CONFIG)\n\n    WEB_TOOLS = [\n        GoogleSearchTool(provider=\"serper\"),\n        VisitTool(browser),\n        PageUpTool(browser),\n        PageDownTool(browser),\n        FinderTool(browser),\n        FindNextTool(browser),\n        ArchiveSearchTool(browser),\n        TextInspectorTool(model, text_limit),\n    ]\n\n    def increment_web_agent_token_counts(final_answer, memory_step, agent):\n        token_counts_web = agent.monitor.get_total_token_counts()\n        token_counts.input_tokens += token_counts_web[\"input\"]\n        token_counts.output_tokens += token_counts_web[\"output\"]\n        return True\n\n    text_webbrowser_agent = ToolCallingAgent(\n        model=model,\n        tools=WEB_TOOLS,\n        max_steps=20,\n        verbosity_level=2,\n        planning_interval=4,\n        name=\"search_agent\",\n        description=\"\"\"A team member that will search the internet to answer your question.\n    Ask him for all your questions that require browsing the web.\n    Provide him as much context as possible, in particular if you need to search on a specific timeframe!\n    And don't hesitate to provide him with a complex search task, like finding a difference between two webpages.\n    Your request must be a real sentence, not a google search! 
Like \"Find me this information (...)\" rather than a few keywords.\n    \"\"\",\n        provide_run_summary=True,\n        final_answer_checks=[increment_web_agent_token_counts],\n    )\n    text_webbrowser_agent.prompt_templates[\"managed_agent\"][\"task\"] += \"\"\"You can navigate to .txt online files.\n    If a non-html page is in another format, especially .pdf or a Youtube video, use tool 'inspect_file_as_text' to inspect it.\n    Additionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information.\"\"\"\n\n    manager_agent = CodeAgent(\n        model=model,\n        tools=[visualizer, ti_tool],\n        max_steps=12,\n        verbosity_level=2,\n        additional_authorized_imports=[\"*\"],\n        planning_interval=4,\n        managed_agents=[text_webbrowser_agent],\n    )\n    return manager_agent\n\n\ndef load_gaia_dataset(use_raw_dataset: bool, set_to_run: str) -> datasets.Dataset:\n    if not os.path.exists(\"data/gaia\"):\n        if use_raw_dataset:\n            snapshot_download(\n                repo_id=\"gaia-benchmark/GAIA\",\n                repo_type=\"dataset\",\n                local_dir=\"data/gaia\",\n                ignore_patterns=[\".gitattributes\", \"README.md\"],\n            )\n        else:\n            # WARNING: this dataset is gated: make sure you visit the repo to require access.\n            snapshot_download(\n                repo_id=\"smolagents/GAIA-annotated\",\n                repo_type=\"dataset\",\n                local_dir=\"data/gaia\",\n                ignore_patterns=[\".gitattributes\", \"README.md\"],\n            )\n\n    def preprocess_file_paths(row):\n        if len(row[\"file_name\"]) > 0:\n            row[\"file_name\"] = f\"data/gaia/{set_to_run}/\" + row[\"file_name\"]\n        return row\n\n    eval_ds = datasets.load_dataset(\n        \"data/gaia/GAIA.py\",\n 
       name=\"2023_all\",\n        split=set_to_run,\n        # data_files={\"validation\": \"validation/metadata.jsonl\", \"test\": \"test/metadata.jsonl\"},\n    )\n\n    eval_ds = eval_ds.rename_columns({\"Question\": \"question\", \"Final answer\": \"true_answer\", \"Level\": \"task\"})\n    eval_ds = eval_ds.map(preprocess_file_paths)\n    return eval_ds\n\n\ndef append_answer(entry: dict, jsonl_file: str) -> None:\n    jsonl_path = Path(jsonl_file)\n    jsonl_path.parent.mkdir(parents=True, exist_ok=True)\n    with append_answer_lock, open(jsonl_file, \"a\", encoding=\"utf-8\") as fp:\n        fp.write(json.dumps(entry) + \"\\n\")\n    assert jsonl_path.exists(), \"File not found!\"\n    print(\"Answer exported to file:\", jsonl_path.resolve())\n\n\ndef answer_single_question(\n    example: dict, model_id: str, answers_file: str, visual_inspection_tool: TextInspectorTool\n) -> None:\n    model_params: dict[str, Any] = {\n        \"model_id\": model_id,\n        \"custom_role_conversions\": custom_role_conversions,\n    }\n    if model_id == \"o1\":\n        model_params[\"reasoning_effort\"] = \"high\"\n        model_params[\"max_completion_tokens\"] = 8192\n    else:\n        model_params[\"max_tokens\"] = 4096\n    model = LiteLLMModel(**model_params)\n    # model = InferenceClientModel(model_id=\"Qwen/Qwen3-32B\", provider=\"novita\", max_tokens=4096)\n    document_inspection_tool = TextInspectorTool(model, 100000)\n\n    total_token_counts: TokenUsage = {\n        \"input\": 0,\n        \"output\": 0,\n    }\n    agent = create_agent_team(model, total_token_counts)\n\n    augmented_question = \"\"\"You have one question to answer. 
It is paramount that you provide a correct answer.\nGive it all you can: I know for a fact that you have access to all the relevant tools to solve it and find the correct answer (the answer does exist).\nFailure or 'I cannot answer' or 'None found' will not be tolerated, success will be rewarded.\nRun verification steps if that's needed, you must make sure you find the correct answer! Here is the task:\n\n\"\"\" + example[\"question\"]\n\n    if example[\"file_name\"]:\n        if \".zip\" in example[\"file_name\"]:\n            prompt_use_files = \"\\n\\nTo solve the task above, you will have to use these attached files:\\n\"\n            prompt_use_files += get_zip_description(\n                example[\"file_name\"], example[\"question\"], visual_inspection_tool, document_inspection_tool\n            )\n        else:\n            prompt_use_files = \"\\n\\nTo solve the task above, you will have to use this attached file:\\n\"\n            prompt_use_files += get_single_file_description(\n                example[\"file_name\"], example[\"question\"], visual_inspection_tool, document_inspection_tool\n            )\n        augmented_question += prompt_use_files\n\n    start_time = datetime.now().strftime(\"%Y-%m-%d %H:%M:%S\")\n    try:\n        # Run agent 🚀\n        final_result = agent.run(augmented_question)\n\n        agent_memory = agent.write_memory_to_messages()\n\n        final_result = prepare_response(augmented_question, agent_memory, reformulation_model=model)\n\n        output = str(final_result)\n        for memory_step in agent.memory.steps:\n            memory_step.model_input_messages = None\n        intermediate_steps = agent_memory\n\n        # Check for parsing errors which indicate the LLM failed to follow the required format\n        parsing_error = True if any([\"AgentParsingError\" in step for step in intermediate_steps]) else False\n\n        # check if iteration limit exceeded\n        iteration_limit_exceeded = True if \"Agent stopped 
due to iteration limit or time limit.\" in output else False\n        raised_exception = False\n\n    except Exception as e:\n        print(\"Error on \", augmented_question, e)\n        output = None\n        intermediate_steps = []\n        parsing_error = False\n        iteration_limit_exceeded = False\n        exception = e\n        raised_exception = True\n    end_time = datetime.now().strftime(\"%Y-%m-%d %H:%M:%S\")\n    token_counts_manager = agent.monitor.get_total_token_counts()\n    total_token_counts.input_tokens += token_counts_manager[\"input\"]\n    total_token_counts.output_tokens += token_counts_manager[\"output\"]\n    annotated_example = {\n        \"agent_name\": model.model_id,\n        \"question\": example[\"question\"],\n        \"augmented_question\": augmented_question,\n        \"prediction\": output,\n        \"intermediate_steps\": intermediate_steps,\n        \"parsing_error\": parsing_error,\n        \"iteration_limit_exceeded\": iteration_limit_exceeded,\n        \"agent_error\": str(exception) if raised_exception else None,\n        \"task\": example[\"task\"],\n        \"task_id\": example[\"task_id\"],\n        \"true_answer\": example[\"true_answer\"],\n        \"start_time\": start_time,\n        \"end_time\": end_time,\n        \"token_counts\": total_token_counts,\n    }\n    append_answer(annotated_example, answers_file)\n\n\ndef get_examples_to_answer(answers_file: str, eval_ds: datasets.Dataset) -> list[dict]:\n    print(f\"Loading answers from {answers_file}...\")\n    try:\n        done_questions = pd.read_json(answers_file, lines=True)[\"question\"].tolist()\n        print(f\"Found {len(done_questions)} previous results!\")\n    except Exception as e:\n        print(\"Error when loading records: \", e)\n        print(\"No usable records! 
▶️ Starting new.\")\n        done_questions = []\n    return [line for line in eval_ds.to_list() if line[\"question\"] not in done_questions and line[\"file_name\"]]\n\n\ndef main():\n    args = parse_args()\n    print(f\"Starting run with arguments: {args}\")\n\n    eval_ds = load_gaia_dataset(args.use_raw_dataset, args.set_to_run)\n    print(\"Loaded evaluation dataset:\")\n    print(pd.DataFrame(eval_ds)[\"task\"].value_counts())\n\n    answers_file = f\"output/{args.set_to_run}/{args.run_name}.jsonl\"\n    tasks_to_run = get_examples_to_answer(answers_file, eval_ds)\n\n    with ThreadPoolExecutor(max_workers=args.concurrency) as exe:\n        futures = [\n            exe.submit(answer_single_question, example, args.model_id, answers_file, visualizer)\n            for example in tasks_to_run\n        ]\n        for f in tqdm(as_completed(futures), total=len(tasks_to_run), desc=\"Processing tasks\"):\n            f.result()\n\n    # for example in tasks_to_run:\n    #     answer_single_question(example, args.model_id, answers_file, visualizer)\n    print(\"All tasks processed.\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "examples/open_deep_research/scripts/cookies.py",
    "content": "from requests.cookies import RequestsCookieJar\n\n\nCOOKIES_LIST = [\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1718884961,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"ST-xuwub9\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"session_logininfo=AFmmF2swRAIgf4gadACOuWOcipI1anW-dakEjtidNLkufnOC8uml7EECIDh2YisqWELDBJPTGUysCucJ3I0wjXxYjVHro1LHrdW0%3AQUQ3MjNmd2Jiajl3OWZYRnpFNnZlWWV5ZGJWZ0hpcmp4LVVPU280bk4zOS03Z0ozZG9fOFhWZ0dXaVo3NG1wTEg1b3hGaG10TFBlaFBnTlJfbER5bEp0aFhoNS1OLVhYNFRZT2F6ajgzOFpDbGhlUjZpMWRETlFFRjFfTTRiM0RnNTROSkdmMTFMVjFic1VuZ2trbGp4aktDa0JJUC1BWDh3\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753004444.745411,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-YEC\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"CgtRVnI5LW1zRHlQVSjbtNCzBjIhCgJGUhIbEhcSFRMLFBUWFwwYGRobHB0eHw4PIBAREiAk\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050824,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-3PSID\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"g.a000kwibeLUu8Ea9Y-vLun7u3kU5VNJVuMAZl_jdfJaNm50JyDBB4ezJ_bdWu46a7YwObVn44wACgYKAakSARQSFQHGX2MicJcTzecTKH6bHzqU6TMbTxoVAUF8yKqQYK-MoI6Ql3vI2oYTB3E-0076\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1750420959.974642,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"SIDCC\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        
\"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"AKEyXzWQZauHKOo8t87zoEcjaVNIYUX54ohoWXT-tX4aAhEuZzIIptxZAcNkHuG2oDXYL6t-lw\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050652,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"SID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"g.a000kwibeLUu8Ea9Y-vLun7u3kU5VNJVuMAZl_jdfJaNm50JyDBB6VHrZcC3gBAsFPbCQ0gF5AACgYKAYkSARQSFQHGX2Mi9kt0gHg5CxCYSkLQGHWaeBoVAUF8yKre_V6r3jZVak6JV4o2Q0FL0076\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1750420958.397534,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-1PSIDTS\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"sidts-CjIB3EgAEkYL2L-GfrEzW5Dfy62S9oefGNLgst78S_986htCnGcfkxECch_9oz-qytSsZBAA\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753433494.44729,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_ga_M0180HEFCY\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GS1.1.1718871908.1.0.1718873494.0.0.0\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050933,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"SAPISID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"mfeuiC-HraNJ-A03/ASXvCPNJSw7yTFgd6\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        
\"expirationDate\": 1750420959.974764,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-1PSIDCC\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"AKEyXzWHDSoXGCZpZhPxRrnC7B1s8zGIUjeMVyvgtQfsm1fs92lXPtFEI_td9LBUyqVUe0xK\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050881,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"SSID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"AmlwXHnQvOQ10LVd-\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050959,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"__Secure-1PAPISID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"mfeuiC-HraNJ-A03/ASXvCPNJSw7yTFgd6\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050795,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-1PSID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"g.a000kwibeLUu8Ea9Y-vLun7u3kU5VNJVuMAZl_jdfJaNm50JyDBBrlk7lRpKQGywAHEon7WGQAACgYKAQsSARQSFQHGX2MirAmnSRdZl6GPG6KLd4hOihoVAUF8yKoV17Tcj1a_OenIOkf2wBjO0076\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050993,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"__Secure-3PAPISID\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": False,\n 
       \"storeId\": None,\n        \"value\": \"mfeuiC-HraNJ-A03/ASXvCPNJSw7yTFgd6\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1750420959.974815,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-3PSIDCC\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"AKEyXzXM5UjKUEXwSHVmRAIo6hGHA4G63adj3EE1VdNriD0f38jZQbsUKiD4LQbA3BValmTFDg\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1750420958.397647,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__Secure-3PSIDTS\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"sidts-CjIB3EgAEkYL2L-GfrEzW5Dfy62S9oefGNLgst78S_986htCnGcfkxECch_9oz-qytSsZBAA\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050908,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"APISID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"IlQWLPjdNqziwCrV/ANG7Z4x5FF-IBxbZk\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753434620.050855,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"HSID\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"AasA7hmRuTFv7vjoq\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753435873.577793,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"LOGIN_INFO\",\n        \"path\": \"/\",\n        
\"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"AFmmF2swRAIgf4gadACOuWOcipI1anW-dakEjtidNLkufnOC8uml7EECIDh2YisqWELDBJPTGUysCucJ3I0wjXxYjVHro1LHrdW0:QUQ3MjNmd2Jiajl3OWZYRnpFNnZlWWV5ZGJWZ0hpcmp4LVVPU280bk4zOS03Z0ozZG9fOFhWZ0dXaVo3NG1wTEg1b3hGaG10TFBlaFBnTlJfbER5bEp0aFhoNS1OLVhYNFRZT2F6ajgzOFpDbGhlUjZpMWRETlFFRjFfTTRiM0RnNTROSkdmMTFMVjFic1VuZ2trbGp4aktDa0JJUC1BWDh3\",\n    },\n    {\n        \"domain\": \".youtube.com\",\n        \"expirationDate\": 1753444956.555608,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"PREF\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"f4=4000000&f6=40000000&tz=Europe.Paris&f5=30000&f7=100\",\n    },\n]\n\nCOOKIES_LIST += [\n    {\n        \"domain\": \".www.researchgate.net\",\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"isInstIp\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"False\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1734423981,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"__eoi\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"ID=c26f752377373146:T=1718871981:RT=1718884914:S=AA-AfjZw-T_OOX2kW2LLaFzXImgc\",\n    },\n    {\n        \"domain\": \".www.researchgate.net\",\n        \"expirationDate\": 1753444909.646103,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"ptc\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        
\"value\": \"RG1.8947708639250500550.1718872043\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1750507578,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"euconsent-v2-didomi\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"CQAgmoAQAgmoAAHABBENA5EsAP_gAEPgAAYgJ2pB5G5UTWlBIG53YMskIAUFhFBoQEAgAACAAwIBSBIAIIwEAGAAIAgAICACAAIAIBIAIABAGAAAAAAAYIAAIAAIAAAQIAAKIAAAAAAAAgBQAAgIAgggEAAAgEBEABAAgAAAEIIAQNgACgAAACCAAAAAAAABAAAAAAAAQAAAAAAAYCQAAAJIAAAAACAIABAIAAAAAAAAAAAAAAAABBAAIJ2wPIAFAAXABQAFQALgAcAA8ACAAEgALwAZAA0ACIAEcAJgAUgAqgBcADEAGgAPQAfgBEACOAE4AMMAZYA0QBsgDkAHOAO4AfsBBwEIAItARwBHQC6gHUAO2Ae0A_4CHQEXgJ2AUOAo8BT4CpQFqALYAXmAwQBkgDLAGXANjAhCBG8CbAE3gJ1gTtAA.f_wACHwAAAAA\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1718885236,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_gat\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"1\",\n    },\n    {\n        \"domain\": \"www.researchgate.net\",\n        \"expirationDate\": 1721477183,\n        \"hostOnly\": True,\n        \"httpOnly\": False,\n        \"name\": \"_pbjs_userid_consent_data\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"3524755945110770\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1752567981,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"__gads\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": 
\"ID=eca2adb88969c830:T=1718871981:RT=1718884914:S=ALNI_MY2qZchynrhWX6hWMlaI87Pcj9riQ\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1718886709.646173,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"__cf_bm\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"IkQ_J4ciBzKQduRvjqsfSmQu8UygDWbHeROO5JVccfo-1718884909-1.0.1.1-qvNGEdbfI0HfhFP6kwe7R7mkTqODNhFuKhs72lLly6K2BOPMG3kbahpQFGvPK0U8FUfkznkq65gngd1sWj7sDA\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1752567981,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"__gpi\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"UID=00000e4e9aa2e6f2:T=1718871981:RT=1718884914:S=ALNI_MYFNrgzkKn7K6Bd2y8hC6GJCvDiSg\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"_cfuvid\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"_GPmGZkBymiH3UiqTqzakEpi98br3nfFUWC2_u_wqkc-1718884909785-0.0.1.1-604800000\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1753445177.271667,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_ga\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GA1.1.1525244793.1718885177\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1753445177.271482,\n        \"hostOnly\": False,\n        
\"httpOnly\": False,\n        \"name\": \"_ga_4P31SJ70EJ\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GS1.1.1718885177.1.0.1718885177.0.0.0\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1718971576,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_gid\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GA1.2.854907463.1718885177\",\n    },\n    {\n        \"domain\": \".www.researchgate.net\",\n        \"expirationDate\": 1750407982.506505,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"did\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"1dWLO3C6am8l667Q4VUlBo0O1LI49Qi2Vw21SJEXHavBDYT56DI9007W5rYGVFVH\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1750507578,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"didomi_token\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": 
\"eyJ1c2VyX2lkIjoiMTkwMzU4YTUtNWU2My02Y2UzLWJlNzAtZGFjNzVmYjdiY2ExIiwiY3JlYXRlZCI6IjIwMjQtMDYtMjBUMTI6MDY6MTYuODA2WiIsInVwZGF0ZWQiOiIyMDI0LTA2LTIwVDEyOjA2OjE4Ljc4MVoiLCJ2ZW5kb3JzIjp7ImVuYWJsZWQiOlsidHdpdHRlciIsImdvb2dsZSIsImM6bGlua2VkaW4tbWFya2V0aW5nLXNvbHV0aW9ucyIsImM6b3duZXJpcSIsImM6b21uaXR1cmUtYWRvYmUtYW5hbHl0aWNzIiwiYzp0ZWNobm9yYXRpLW1lZGlhIiwiYzppbnRlcmNvbSIsImM6aW50ZW50LWlxIiwiYzppcHJvbSIsImM6bGlua2VkaW4iLCJjOmFtYXpvbmFkdi16Y1hGTEI2WCIsImM6bWVkaWFuZXQtY1V3YUtFNnoiLCJjOmluZGV4ZXhjaC1OWkNRTTY4UCIsImM6emVvdGFwZ21iLWQ3YndtdGp3IiwiYzp0cmlwbGVsaWYtZGRKSDM0clkiLCJjOnJ0YmhvdXNlLWI4Y2RIOHRNIiwiYzptZHByaW1pcy1lYU4yOVdjUCIsImM6bG9vcG1lbGktVGRhWXRCUHEiLCJjOm1hZ25pdGVpbi05d1RZTHFSRCIsImM6Ymlkc3dpdGNoLWQ2N0V3N1c5IiwiYzpvcmFjbGVhZHYtcUhlREptQUwiLCJjOmdvb2dsZWFuYS00VFhuSmlnUiIsImM6bG90YW1lc29sLURIaTdMUmpNIiwiYzpuZXh0bWlsbGUtR0pyZlg4VWMiLCJjOm5yaWNodGVjLXFVVlEyUlFxIiwiYzpicml0ZXBvb2wtQldWeVdHeVUiLCJjOnRhcGFkaW5jLXFxY2tVN1BXIiwiYzppZDV0ZWNobi16Tk1KNGR3ZiIsImM6bWljcm9zb2Z0IiwiYzpwZXJtdXRpdmUtSjdpaHJlTWsiLCJjOm9wZXJhc29mdC1CY1hjRFZKTSIsImM6cG9zdGhvZy1Cakp4RmRGOSJdfSwicHVycG9zZXMiOnsiZW5hYmxlZCI6WyJnZW9sb2NhdGlvbl9kYXRhIiwiZGV2aWNlX2NoYXJhY3RlcmlzdGljcyJdfSwidmVuZG9yc19saSI6eyJlbmFibGVkIjpbImdvb2dsZSIsImM6b3BlcmFzb2Z0LUJjWGNEVkpNIl19LCJ2ZXJzaW9uIjoyLCJhYyI6IkRIU0FvQUZrQWNnQTVnSHFnUUhBeGdCNndEMTRJR0FRTkFqMEJJd0NTY0VyQUtCd1YtZ3MxQmgwREc0R09nQUEuREhTQW9BRmtBY2dBNWdIcWdRSEF4Z0I2d0QxNElHQVFOQWowQkl3Q1NjRXJBS0J3Vi1nczFCaDBERzRHT2dBQSJ9\",\n    },\n    {\n        \"domain\": \".www.researchgate.net\",\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"hasPdpNext\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"False\",\n    },\n    {\n        \"domain\": \".researchgate.net\",\n        \"expirationDate\": 1750421183,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": 
\"ph_phc_ma1XTQyee96N1GML6qUTgLQRiDifnRcE9STiHTZ0CfZ_posthog\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"%7B%22distinct_id%22%3A%220190358a-56a1-7313-83b0-d13dddeac787%22%2C%22%24sesid%22%3A%5B1718885183223%2C%220190358a-56a1-7313-83b0-d13b2b87778d%22%2C1718885176993%5D%2C%22%24session_is_sampled%22%3Atrue%7D\",\n    },\n    {\n        \"domain\": \".www.researchgate.net\",\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"sid\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"qmH5Lc4f0CUJ3zeaxORcV0S8I8V1MuCFZtcIQqPYtv1XPejrbSLAQRbT50PL40TqeKQ1XsQDWt9gtYVzuL80bRmPjw6jn3cQ0ikNqW40maHcQ3JL2Vfa8ZZf0j7p35eJ\",\n    },\n]\n\nCOOKIES_LIST += [\n    {\n        \"domain\": \"github.com\",\n        \"hostOnly\": True,\n        \"httpOnly\": True,\n        \"name\": \"_gh_sess\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"P%2Fmof1avuqwHaUQUIJR%2FZYn7jqbT7lgGuTGjp1BGAFIG5UpNDusEE3b8dRjz0eATE5xPdPjLYFqMs%2FI9AOalKX4YuYfSEEnxCMawU01099b4o9Xzzcv%2BmecrmO0Q8q%2Bdq1h8SIv6nvPP7HzlFesl8ysafb9b%2F0q6dTArKdSOurasza8UgLSYD08ofA50Pcm0IG7CTzF8ZCizrGgGTMi%2F%2B7L3E17jav5PM1Sf2vQKg15Gbg1QIOppJJHzlufgQoZigqFv%2BWznaws0Tt7Y2lSFCw%3D%3D--CJRhqMXJnwOaJgk4--DhUErlL4GdROikEjKD4O9g%3D%3D\",\n    },\n    {\n        \"domain\": \".github.com\",\n        \"expirationDate\": 1750408875.763785,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_octo\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GH1.1.728652011.1718872875\",\n    },\n    {\n        \"domain\": 
\".github.com\",\n        \"expirationDate\": 1750408875.763926,\n        \"hostOnly\": False,\n        \"httpOnly\": True,\n        \"name\": \"logged_in\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"no\",\n    },\n    {\n        \"domain\": \".github.com\",\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"preferred_color_mode\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"dark\",\n    },\n    {\n        \"domain\": \".github.com\",\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"tz\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"Europe%2FParis\",\n    },\n]\n\nCOOKIES_LIST += [\n    {\n        \"domain\": \".web.archive.org\",\n        \"expirationDate\": 1718886430,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_gat\",\n        \"path\": \"/web/20201123221659/http://orcid.org/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"1\",\n    },\n    {\n        \"domain\": \".web.archive.org\",\n        \"expirationDate\": 1718972770,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_gid\",\n        \"path\": \"/web/20201123221659/http://orcid.org/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GA1.2.402246368.1606169825\",\n    },\n    {\n        \"domain\": \".web.archive.org\",\n        \"expirationDate\": 1753446370.315621,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": 
\"_ga\",\n        \"path\": \"/web/20201123221659/http://orcid.org/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GA1.2.1301409987.1606169825\",\n    },\n    {\n        \"domain\": \".web.archive.org\",\n        \"expirationDate\": 1750422367,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_hjid\",\n        \"path\": \"/web/20201123221659/http://orcid.org/\",\n        \"sameSite\": \"lax\",\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"07f80263-a631-4bf4-8ffd-8fc8912085e2\",\n    },\n    {\n        \"domain\": \".web.archive.org\",\n        \"expirationDate\": 1718888167,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_hjFirstSeen\",\n        \"path\": \"/web/20201123221659/http://orcid.org/\",\n        \"sameSite\": \"lax\",\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"1\",\n    },\n]\nCOOKIES_LIST += [\n    {\n        \"domain\": \"orcid.org\",\n        \"hostOnly\": True,\n        \"httpOnly\": False,\n        \"name\": \"AWSELBCORS\",\n        \"path\": \"/\",\n        \"sameSite\": \"no_restriction\",\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"CBD1D7FF1216388FA48838CBCA4774FD22800B8FB548A40EF92BB0994D5B77A8410307CDEAA69C52236663F2BF89B252C17BC0FCDF790FD59771BDDF6EA8CA4CFD29D8733F\",\n    },\n    {\n        \"domain\": \".orcid.org\",\n        \"expirationDate\": 1753452454.637671,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_ga_9R61FWK9H5\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GS1.1.1718892454.1.0.1718892454.0.0.0\",\n    },\n    {\n        \"domain\": 
\".orcid.org\",\n        \"expirationDate\": 1753452454.63421,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"_ga\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"GA1.1.2021310691.1718892455\",\n    },\n    {\n        \"domain\": \"orcid.org\",\n        \"hostOnly\": True,\n        \"httpOnly\": False,\n        \"name\": \"AWSELB\",\n        \"path\": \"/\",\n        \"sameSite\": None,\n        \"secure\": False,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"CBD1D7FF1216388FA48838CBCA4774FD22800B8FB548A40EF92BB0994D5B77A8410307CDEAA69C52236663F2BF89B252C17BC0FCDF790FD59771BDDF6EA8CA4CFD29D8733F\",\n    },\n    {\n        \"domain\": \".orcid.org\",\n        \"expirationDate\": 1750428454,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"OptanonAlertBoxClosed\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"2024-06-20T14:07:34.583Z\",\n    },\n    {\n        \"domain\": \".orcid.org\",\n        \"expirationDate\": 1750428454,\n        \"hostOnly\": False,\n        \"httpOnly\": False,\n        \"name\": \"OptanonConsent\",\n        \"path\": \"/\",\n        \"sameSite\": \"lax\",\n        \"secure\": False,\n        \"session\": False,\n        \"storeId\": None,\n        \"value\": \"isGpcEnabled=0&datestamp=Thu+Jun+20+2024+16%3A07%3A34+GMT%2B0200+(heure+d%E2%80%99%C3%A9t%C3%A9+d%E2%80%99Europe+centrale)&version=202310.2.0&browserGpcFlag=0&isIABGlobal=False&hosts=&landingPath=NotLandingPage&groups=C0001%3A1%2CC0003%3A1%2CC0002%3A1%2CC0004%3A1\",\n    },\n    {\n        \"domain\": \"orcid.org\",\n        \"hostOnly\": True,\n        \"httpOnly\": False,\n        \"name\": \"XSRF-TOKEN\",\n        \"path\": \"/\",\n        \"sameSite\": 
None,\n        \"secure\": True,\n        \"session\": True,\n        \"storeId\": None,\n        \"value\": \"6957be7a-bcb4-4d59-a522-ea9b6b210ed9\",\n    },\n]\n\n# Create a RequestsCookieJar instance\nCOOKIES = RequestsCookieJar()\n\n# Add cookies to the jar\nfor cookie in COOKIES_LIST:\n    COOKIES.set(cookie[\"name\"], cookie[\"value\"], domain=cookie[\"domain\"], path=cookie[\"path\"])\n"
  },
  {
    "path": "examples/open_deep_research/scripts/gaia_scorer.py",
    "content": "import re\nimport string\nimport warnings\n\n\ndef normalize_number_str(number_str: str) -> float:\n    # we replace these common units and commas to allow\n    # conversion to float\n    for char in [\"$\", \"%\", \",\"]:\n        number_str = number_str.replace(char, \"\")\n    try:\n        return float(number_str)\n    except ValueError:\n        print(f\"String {number_str} cannot be normalized to number str.\")\n        return float(\"inf\")\n\n\ndef split_string(\n    s: str,\n    char_list: list[str] = [\",\", \";\"],\n) -> list[str]:\n    pattern = f\"[{''.join(char_list)}]\"\n    return re.split(pattern, s)\n\n\ndef is_float(element: any) -> bool:\n    try:\n        float(element)\n        return True\n    except ValueError:\n        return False\n\n\ndef question_scorer(\n    model_answer: str,\n    ground_truth: str,\n) -> bool:\n    # if gt is a number\n    if is_float(ground_truth):\n        normalized_answer = normalize_number_str(str(model_answer))\n        return normalized_answer == float(ground_truth)\n\n    # if gt is a list\n    elif any(char in ground_truth for char in [\",\", \";\"]):\n        # question with the fish: normalization removes punct\n\n        gt_elems = split_string(ground_truth)\n        ma_elems = split_string(model_answer)\n\n        # check length is the same\n        if len(gt_elems) != len(ma_elems):\n            warnings.warn(\"Answer lists have different lengths, returning False.\", UserWarning)\n            return False\n\n        # compare each element as float or str\n        comparisons = []\n        for ma_elem, gt_elem in zip(ma_elems, gt_elems):\n            if is_float(gt_elem):\n                normalized_ma_elem = normalize_number_str(ma_elem)\n                comparisons.append(normalized_ma_elem == float(gt_elem))\n            else:\n                # we do not remove punct since comparisons can include punct\n                comparisons.append(\n                    normalize_str(ma_elem, 
remove_punct=False) == normalize_str(gt_elem, remove_punct=False)\n                )\n        return all(comparisons)\n\n    # if gt is a str\n    else:\n        return normalize_str(model_answer) == normalize_str(ground_truth)\n\n\ndef check_prediction_contains_answer_letters_in_order(prediction, true_answer):\n    prediction = prediction.lower()\n    true_answer = true_answer.lower()\n    if len(prediction) > len(true_answer) * 3:\n        return False\n    i = 0\n    for letter in true_answer:\n        if letter in prediction[i:]:\n            i += prediction[i:].index(letter)\n        else:\n            return False\n    return True\n\n\ndef check_close_call(prediction, true_answer, is_correct):\n    if is_correct:\n        return True\n    else:\n        if is_float(true_answer):\n            return is_correct\n        else:\n            if (\n                check_prediction_contains_answer_letters_in_order(str(prediction), str(true_answer))\n                and len(str(true_answer)) * 0.5 <= len(str(prediction)) <= len(str(true_answer)) * 2\n            ):\n                print(f\"Close call: {prediction} vs {true_answer}\")\n                return True\n            else:\n                return False\n\n\ndef normalize_str(input_str, remove_punct=True) -> str:\n    \"\"\"\n    Normalize a string by:\n    - Removing all white spaces\n    - Optionally removing punctuation (if remove_punct is True)\n    - Converting to lowercase\n    Parameters:\n    - input_str: str, the string to normalize\n    - remove_punct: bool, whether to remove punctuation (default: True)\n    Returns:\n    - str, the normalized string\n    \"\"\"\n    # Remove all white spaces. Required e.g for seagull vs. 
sea gull\n    no_spaces = re.sub(r\"\\s\", \"\", input_str)\n\n    # Remove punctuation, if specified.\n    if remove_punct:\n        translator = str.maketrans(\"\", \"\", string.punctuation)\n        return no_spaces.lower().translate(translator)\n    else:\n        return no_spaces.lower()\n"
  },
  {
    "path": "examples/open_deep_research/scripts/mdconvert.py",
    "content": "# This is copied from Magentic-one's great repo: https://github.com/microsoft/autogen/blob/v0.4.4/python/packages/autogen-magentic-one/src/autogen_magentic_one/markdown_browser/mdconvert.py\n# Thanks to Microsoft researchers for open-sourcing this!\n# type: ignore\nimport base64\nimport copy\nimport html\nimport json\nimport mimetypes\nimport os\nimport re\nimport shutil\nimport subprocess\nimport sys\nimport tempfile\nimport traceback\nimport zipfile\nfrom typing import Any\nfrom urllib.parse import parse_qs, quote, unquote, urlparse, urlunparse\n\nimport mammoth\nimport markdownify\nimport pandas as pd\nimport pdfminer\nimport pdfminer.high_level\nimport pptx\n\n# File-format detection\nimport puremagic\nimport pydub\nimport requests\nimport speech_recognition as sr\nfrom bs4 import BeautifulSoup\nfrom youtube_transcript_api import YouTubeTranscriptApi\nfrom youtube_transcript_api.formatters import SRTFormatter\n\n\nclass _CustomMarkdownify(markdownify.MarkdownConverter):\n    \"\"\"\n    A custom version of markdownify's MarkdownConverter. 
Changes include:\n\n    - Altering the default heading style to use '#', '##', etc.\n    - Removing javascript hyperlinks.\n    - Truncating images with large data:uri sources.\n    - Ensuring URIs are properly escaped, and do not conflict with Markdown syntax\n    \"\"\"\n\n    def __init__(self, **options: Any):\n        options[\"heading_style\"] = options.get(\"heading_style\", markdownify.ATX)\n        # Explicitly cast options to the expected type if necessary\n        super().__init__(**options)\n\n    def convert_hn(self, n: int, el: Any, text: str, convert_as_inline: bool) -> str:\n        \"\"\"Same as usual, but be sure to start with a new line\"\"\"\n        if not convert_as_inline:\n            if not re.search(r\"^\\n\", text):\n                return \"\\n\" + super().convert_hn(n, el, text, convert_as_inline)  # type: ignore\n\n        return super().convert_hn(n, el, text, convert_as_inline)  # type: ignore\n\n    def convert_a(self, el: Any, text: str, convert_as_inline: bool):\n        \"\"\"Same as usual converter, but removes Javascript links and escapes URIs.\"\"\"\n        prefix, suffix, text = markdownify.chomp(text)  # type: ignore\n        if not text:\n            return \"\"\n        href = el.get(\"href\")\n        title = el.get(\"title\")\n\n        # Escape URIs and skip non-http or file schemes\n        if href:\n            try:\n                parsed_url = urlparse(href)  # type: ignore\n                if parsed_url.scheme and parsed_url.scheme.lower() not in [\"http\", \"https\", \"file\"]:  # type: ignore\n                    return \"%s%s%s\" % (prefix, text, suffix)\n                href = urlunparse(parsed_url._replace(path=quote(unquote(parsed_url.path))))  # type: ignore\n            except ValueError:  # It's not clear if this ever gets thrown\n                return \"%s%s%s\" % (prefix, text, suffix)\n\n        # For the replacement see #29: text nodes underscores are escaped\n        if (\n            
self.options[\"autolinks\"]\n            and text.replace(r\"\\_\", \"_\") == href\n            and not title\n            and not self.options[\"default_title\"]\n        ):\n            # Shortcut syntax\n            return \"<%s>\" % href\n        if self.options[\"default_title\"] and not title:\n            title = href\n        title_part = ' \"%s\"' % title.replace('\"', r\"\\\"\") if title else \"\"\n        return \"%s[%s](%s%s)%s\" % (prefix, text, href, title_part, suffix) if href else text\n\n    def convert_img(self, el: Any, text: str, convert_as_inline: bool) -> str:\n        \"\"\"Same as usual converter, but removes data URIs\"\"\"\n\n        alt = el.attrs.get(\"alt\", None) or \"\"\n        src = el.attrs.get(\"src\", None) or \"\"\n        title = el.attrs.get(\"title\", None) or \"\"\n        title_part = ' \"%s\"' % title.replace('\"', r\"\\\"\") if title else \"\"\n        if convert_as_inline and el.parent.name not in self.options[\"keep_inline_images_in\"]:\n            return alt\n\n        # Remove dataURIs\n        if src.startswith(\"data:\"):\n            src = src.split(\",\")[0] + \"...\"\n\n        return \"![%s](%s%s)\" % (alt, src, title_part)\n\n    def convert_soup(self, soup: Any) -> str:\n        return super().convert_soup(soup)  # type: ignore\n\n\nclass DocumentConverterResult:\n    \"\"\"The result of converting a document to text.\"\"\"\n\n    def __init__(self, title: str | None = None, text_content: str = \"\"):\n        self.title: str | None = title\n        self.text_content: str = text_content\n\n\nclass DocumentConverter:\n    \"\"\"Abstract superclass of all DocumentConverters.\"\"\"\n\n    def convert(self, local_path: str, **kwargs: Any) -> None | DocumentConverterResult:\n        raise NotImplementedError()\n\n\nclass PlainTextConverter(DocumentConverter):\n    \"\"\"Anything with content type text/plain\"\"\"\n\n    def convert(self, local_path: str, **kwargs: Any) -> None | DocumentConverterResult:\n        # 
Guess the content type from any file extension that might be around\n        content_type, _ = mimetypes.guess_type(\"__placeholder\" + kwargs.get(\"file_extension\", \"\"))\n\n        # Only accept text files\n        if content_type is None:\n            return None\n        # elif \"text/\" not in content_type.lower():\n        #     return None\n\n        text_content = \"\"\n        with open(local_path, \"rt\", encoding=\"utf-8\") as fh:\n            text_content = fh.read()\n        return DocumentConverterResult(\n            title=None,\n            text_content=text_content,\n        )\n\n\nclass HtmlConverter(DocumentConverter):\n    \"\"\"Anything with content type text/html\"\"\"\n\n    def convert(self, local_path: str, **kwargs: Any) -> None | DocumentConverterResult:\n        # Bail if not html\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() not in [\".html\", \".htm\"]:\n            return None\n\n        result = None\n        with open(local_path, \"rt\", encoding=\"utf-8\") as fh:\n            result = self._convert(fh.read())\n\n        return result\n\n    def _convert(self, html_content: str) -> None | DocumentConverterResult:\n        \"\"\"Helper function that converts an HTML string.\"\"\"\n\n        # Parse the string\n        soup = BeautifulSoup(html_content, \"html.parser\")\n\n        # Remove javascript and style blocks\n        for script in soup([\"script\", \"style\"]):\n            script.extract()\n\n        # Convert only the main content\n        body_elm = soup.find(\"body\")\n        webpage_text = \"\"\n        if body_elm:\n            webpage_text = _CustomMarkdownify().convert_soup(body_elm)\n        else:\n            webpage_text = _CustomMarkdownify().convert_soup(soup)\n\n        assert isinstance(webpage_text, str)\n\n        return DocumentConverterResult(\n            title=None if soup.title is None else soup.title.string, text_content=webpage_text\n        )\n\n\nclass 
WikipediaConverter(DocumentConverter):\n    \"\"\"Handle Wikipedia pages separately, focusing only on the main document content.\"\"\"\n\n    def convert(self, local_path: str, **kwargs: Any) -> None | DocumentConverterResult:\n        # Bail if not Wikipedia\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() not in [\".html\", \".htm\"]:\n            return None\n        url = kwargs.get(\"url\", \"\")\n        # Escape both dots so that only a literal \"wikipedia.org\" host matches\n        if not re.search(r\"^https?:\\/\\/[a-zA-Z]{2,3}\\.wikipedia\\.org\\/\", url):\n            return None\n\n        # Parse the file\n        soup = None\n        with open(local_path, \"rt\", encoding=\"utf-8\") as fh:\n            soup = BeautifulSoup(fh.read(), \"html.parser\")\n\n        # Remove javascript and style blocks\n        for script in soup([\"script\", \"style\"]):\n            script.extract()\n\n        # Convert only the main content\n        body_elm = soup.find(\"div\", {\"id\": \"mw-content-text\"})\n        title_elm = soup.find(\"span\", {\"class\": \"mw-page-title-main\"})\n\n        webpage_text = \"\"\n        main_title = None if soup.title is None else soup.title.string\n\n        if body_elm:\n            # Prefer the on-page title element when it is present\n            if title_elm and len(title_elm) > 0:\n                main_title = title_elm.string  # type: ignore\n                assert isinstance(main_title, str)\n\n            # Convert the page\n            webpage_text = f\"# {main_title}\\n\\n\" + _CustomMarkdownify().convert_soup(body_elm)\n        else:\n            webpage_text = _CustomMarkdownify().convert_soup(soup)\n\n        return DocumentConverterResult(\n            title=main_title,\n            text_content=webpage_text,\n        )\n\n\nclass YouTubeConverter(DocumentConverter):\n    \"\"\"Handle YouTube specially, focusing on the video title, description, and transcript.\"\"\"\n\n    def convert(self, local_path: str, **kwargs: Any) -> None | DocumentConverterResult:\n        # Bail if not YouTube\n        
extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() not in [\".html\", \".htm\"]:\n            return None\n        url = kwargs.get(\"url\", \"\")\n        if not url.startswith(\"https://www.youtube.com/watch?\"):\n            return None\n\n        # Parse the file\n        soup = None\n        with open(local_path, \"rt\", encoding=\"utf-8\") as fh:\n            soup = BeautifulSoup(fh.read(), \"html.parser\")\n\n        # Read the meta tags\n        assert soup.title is not None and soup.title.string is not None\n        metadata: dict[str, str] = {\"title\": soup.title.string}\n        for meta in soup([\"meta\"]):\n            for a in meta.attrs:\n                if a in [\"itemprop\", \"property\", \"name\"]:\n                    metadata[meta[a]] = meta.get(\"content\", \"\")\n                    break\n\n        # We can also try to read the full description. This is more prone to breaking, since it reaches into the page implementation\n        try:\n            for script in soup([\"script\"]):\n                content = script.text\n                if \"ytInitialData\" in content:\n                    lines = re.split(r\"\\r?\\n\", content)\n                    obj_start = lines[0].find(\"{\")\n                    obj_end = lines[0].rfind(\"}\")\n                    if obj_start >= 0 and obj_end >= 0:\n                        data = json.loads(lines[0][obj_start : obj_end + 1])\n                        attrdesc = self._findKey(data, \"attributedDescriptionBodyText\")  # type: ignore\n                        if attrdesc:\n                            metadata[\"description\"] = str(attrdesc[\"content\"])\n                    break\n        except Exception:\n            pass\n\n        # Start preparing the page\n        webpage_text = \"# YouTube\\n\"\n\n        title = self._get(metadata, [\"title\", \"og:title\", \"name\"])  # type: ignore\n        assert isinstance(title, str)\n\n        if title:\n            webpage_text 
+= f\"\\n## {title}\\n\"\n\n        stats = \"\"\n        views = self._get(metadata, [\"interactionCount\"])  # type: ignore\n        if views:\n            stats += f\"- **Views:** {views}\\n\"\n\n        keywords = self._get(metadata, [\"keywords\"])  # type: ignore\n        if keywords:\n            stats += f\"- **Keywords:** {keywords}\\n\"\n\n        runtime = self._get(metadata, [\"duration\"])  # type: ignore\n        if runtime:\n            stats += f\"- **Runtime:** {runtime}\\n\"\n\n        if len(stats) > 0:\n            webpage_text += f\"\\n### Video Metadata\\n{stats}\\n\"\n\n        description = self._get(metadata, [\"description\", \"og:description\"])  # type: ignore\n        if description:\n            webpage_text += f\"\\n### Description\\n{description}\\n\"\n\n        transcript_text = \"\"\n        parsed_url = urlparse(url)  # type: ignore\n        params = parse_qs(parsed_url.query)  # type: ignore\n        if \"v\" in params:\n            assert isinstance(params[\"v\"][0], str)\n            video_id = str(params[\"v\"][0])\n            try:\n                # Must be a single transcript.\n                transcript = YouTubeTranscriptApi.get_transcript(video_id)  # type: ignore\n                # transcript_text = \" \".join([part[\"text\"] for part in transcript])  # type: ignore\n                # Alternative formatting:\n                transcript_text = SRTFormatter().format_transcript(transcript)\n            except Exception:\n                pass\n        if transcript_text:\n            webpage_text += f\"\\n### Transcript\\n{transcript_text}\\n\"\n\n        title = title if title else soup.title.string\n        assert isinstance(title, str)\n\n        return DocumentConverterResult(\n            title=title,\n            text_content=webpage_text,\n        )\n\n    def _get(self, metadata: dict[str, str], keys: list[str], default: str | None = None) -> str | None:\n        for k in keys:\n            if k in metadata:\n       
         return metadata[k]\n        return default\n\n    def _findKey(self, json: Any, key: str) -> str | None:  # TODO: Fix json type\n        if isinstance(json, list):\n            for elm in json:\n                ret = self._findKey(elm, key)\n                if ret is not None:\n                    return ret\n        elif isinstance(json, dict):\n            for k in json:\n                if k == key:\n                    return json[k]\n                else:\n                    ret = self._findKey(json[k], key)\n                    if ret is not None:\n                        return ret\n        return None\n\n\nclass PdfConverter(DocumentConverter):\n    \"\"\"\n    Converts PDFs to Markdown. Most style information is ignored, so the results are essentially plain-text.\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not a PDF\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() != \".pdf\":\n            return None\n\n        return DocumentConverterResult(\n            title=None,\n            text_content=pdfminer.high_level.extract_text(local_path),\n        )\n\n\nclass DocxConverter(HtmlConverter):\n    \"\"\"\n    Converts DOCX files to Markdown. 
Style information (e.g., headings) and tables are preserved where possible.\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not a DOCX\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() != \".docx\":\n            return None\n\n        result = None\n        with open(local_path, \"rb\") as docx_file:\n            result = mammoth.convert_to_html(docx_file)\n            html_content = result.value\n            result = self._convert(html_content)\n\n        return result\n\n\nclass XlsxConverter(HtmlConverter):\n    \"\"\"\n    Converts XLSX files to Markdown, with each sheet presented as a separate Markdown table.\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not an XLSX\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() not in [\".xlsx\", \".xls\"]:\n            return None\n\n        sheets = pd.read_excel(local_path, sheet_name=None)\n        md_content = \"\"\n        for s in sheets:\n            md_content += f\"## {s}\\n\"\n            html_content = sheets[s].to_html(index=False)\n            md_content += self._convert(html_content).text_content.strip() + \"\\n\\n\"\n\n        return DocumentConverterResult(\n            title=None,\n            text_content=md_content.strip(),\n        )\n\n\nclass PptxConverter(HtmlConverter):\n    \"\"\"\n    Converts PPTX files to Markdown. 
Supports headings, tables, and images with alt text.\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not a PPTX\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() != \".pptx\":\n            return None\n\n        md_content = \"\"\n\n        presentation = pptx.Presentation(local_path)\n        slide_num = 0\n        for slide in presentation.slides:\n            slide_num += 1\n\n            md_content += f\"\\n\\n<!-- Slide number: {slide_num} -->\\n\"\n\n            title = slide.shapes.title\n            for shape in slide.shapes:\n                # Pictures\n                if self._is_picture(shape):\n                    # https://github.com/scanny/python-pptx/pull/512#issuecomment-1713100069\n                    alt_text = \"\"\n                    try:\n                        alt_text = shape._element._nvXxPr.cNvPr.attrib.get(\"descr\", \"\")\n                    except Exception:\n                        pass\n\n                    # A placeholder name\n                    filename = re.sub(r\"\\W\", \"\", shape.name) + \".jpg\"\n                    md_content += \"\\n![\" + (alt_text if alt_text else shape.name) + \"](\" + filename + \")\\n\"\n\n                # Tables\n                if self._is_table(shape):\n                    html_table = \"<html><body><table>\"\n                    first_row = True\n                    for row in shape.table.rows:\n                        html_table += \"<tr>\"\n                        for cell in row.cells:\n                            if first_row:\n                                html_table += \"<th>\" + html.escape(cell.text) + \"</th>\"\n                            else:\n                                html_table += \"<td>\" + html.escape(cell.text) + \"</td>\"\n                        html_table += \"</tr>\"\n                        first_row = False\n                    html_table += 
\"</table></body></html>\"\n                    md_content += \"\\n\" + self._convert(html_table).text_content.strip() + \"\\n\"\n\n                # Text areas\n                elif shape.has_text_frame:\n                    if shape == title:\n                        md_content += \"# \" + shape.text.lstrip() + \"\\n\"\n                    else:\n                        md_content += shape.text + \"\\n\"\n\n            md_content = md_content.strip()\n\n            if slide.has_notes_slide:\n                md_content += \"\\n\\n### Notes:\\n\"\n                notes_frame = slide.notes_slide.notes_text_frame\n                if notes_frame is not None:\n                    md_content += notes_frame.text\n                md_content = md_content.strip()\n\n        return DocumentConverterResult(\n            title=None,\n            text_content=md_content.strip(),\n        )\n\n    def _is_picture(self, shape):\n        if shape.shape_type == pptx.enum.shapes.MSO_SHAPE_TYPE.PICTURE:\n            return True\n        if shape.shape_type == pptx.enum.shapes.MSO_SHAPE_TYPE.PLACEHOLDER:\n            if hasattr(shape, \"image\"):\n                return True\n        return False\n\n    def _is_table(self, shape):\n        if shape.shape_type == pptx.enum.shapes.MSO_SHAPE_TYPE.TABLE:\n            return True\n        return False\n\n\nclass MediaConverter(DocumentConverter):\n    \"\"\"\n    Abstract class for multi-modal media (e.g., images and audio)\n    \"\"\"\n\n    def _get_metadata(self, local_path):\n        exiftool = shutil.which(\"exiftool\")\n        if not exiftool:\n            return None\n        else:\n            try:\n                result = subprocess.run([exiftool, \"-json\", local_path], capture_output=True, text=True).stdout\n                return json.loads(result)[0]\n            except Exception:\n                return None\n\n\nclass WavConverter(MediaConverter):\n    \"\"\"\n    Converts WAV files to markdown via extraction of metadata 
(if `exiftool` is installed), and speech transcription (if `speech_recognition` is installed).\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not a WAV\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() != \".wav\":\n            return None\n\n        md_content = \"\"\n\n        # Add metadata\n        metadata = self._get_metadata(local_path)\n        if metadata:\n            for f in [\n                \"Title\",\n                \"Artist\",\n                \"Author\",\n                \"Band\",\n                \"Album\",\n                \"Genre\",\n                \"Track\",\n                \"DateTimeOriginal\",\n                \"CreateDate\",\n                \"Duration\",\n            ]:\n                if f in metadata:\n                    md_content += f\"{f}: {metadata[f]}\\n\"\n\n        # Transcribe\n        try:\n            transcript = self._transcribe_audio(local_path)\n            md_content += \"\\n\\n### Audio Transcript:\\n\" + (\"[No speech detected]\" if transcript == \"\" else transcript)\n        except Exception:\n            md_content += \"\\n\\n### Audio Transcript:\\nError. 
Could not transcribe this audio.\"\n\n        return DocumentConverterResult(\n            title=None,\n            text_content=md_content.strip(),\n        )\n\n    def _transcribe_audio(self, local_path) -> str:\n        recognizer = sr.Recognizer()\n        with sr.AudioFile(local_path) as source:\n            audio = recognizer.record(source)\n            return recognizer.recognize_google(audio).strip()\n\n\nclass Mp3Converter(WavConverter):\n    \"\"\"\n    Converts MP3 and M4A files to markdown via extraction of metadata (if `exiftool` is installed), and speech transcription (if `speech_recognition` AND `pydub` are installed).\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not a MP3\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() not in [\".mp3\", \".m4a\"]:\n            return None\n\n        md_content = \"\"\n\n        # Add metadata\n        metadata = self._get_metadata(local_path)\n        if metadata:\n            for f in [\n                \"Title\",\n                \"Artist\",\n                \"Author\",\n                \"Band\",\n                \"Album\",\n                \"Genre\",\n                \"Track\",\n                \"DateTimeOriginal\",\n                \"CreateDate\",\n                \"Duration\",\n            ]:\n                if f in metadata:\n                    md_content += f\"{f}: {metadata[f]}\\n\"\n\n        # Transcribe\n        handle, temp_path = tempfile.mkstemp(suffix=\".wav\")\n        os.close(handle)\n        try:\n            if extension.lower() == \".mp3\":\n                sound = pydub.AudioSegment.from_mp3(local_path)\n            else:\n                sound = pydub.AudioSegment.from_file(local_path, format=\"m4a\")\n            sound.export(temp_path, format=\"wav\")\n\n            _args = dict()\n            _args.update(kwargs)\n            _args[\"file_extension\"] = \".wav\"\n\n            
try:\n                transcript = super()._transcribe_audio(temp_path).strip()\n                md_content += \"\\n\\n### Audio Transcript:\\n\" + (\n                    \"[No speech detected]\" if transcript == \"\" else transcript\n                )\n            except Exception:\n                md_content += \"\\n\\n### Audio Transcript:\\nError. Could not transcribe this audio.\"\n\n        finally:\n            os.unlink(temp_path)\n\n        # Return the result\n        return DocumentConverterResult(\n            title=None,\n            text_content=md_content.strip(),\n        )\n\n\nclass ZipConverter(DocumentConverter):\n    \"\"\"\n    Extracts ZIP files to a permanent local directory and returns a listing of extracted files.\n    \"\"\"\n\n    def __init__(self, extract_dir: str = \"downloads\"):\n        \"\"\"\n        Initialize with path to extraction directory.\n\n        Args:\n            extract_dir: The directory where files will be extracted. Defaults to \"downloads\"\n        \"\"\"\n        self.extract_dir = extract_dir\n        # Create the extraction directory if it doesn't exist\n        os.makedirs(self.extract_dir, exist_ok=True)\n\n    def convert(self, local_path: str, **kwargs: Any) -> None | DocumentConverterResult:\n        # Bail if not a ZIP file\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() != \".zip\":\n            return None\n\n        # Verify it's actually a ZIP file\n        if not zipfile.is_zipfile(local_path):\n            return None\n\n        # Extract all files and build list\n        extracted_files = []\n        with zipfile.ZipFile(local_path, \"r\") as zip_ref:\n            # Extract all files\n            zip_ref.extractall(self.extract_dir)\n            # Get list of all files\n            for file_path in zip_ref.namelist():\n                # Skip directories\n                if not file_path.endswith(\"/\"):\n                    
extracted_files.append(self.extract_dir + \"/\" + file_path)\n\n        # Sort files for consistent output\n        extracted_files.sort()\n\n        # Build the markdown content\n        md_content = \"Downloaded the following files:\\n\"\n        for file in extracted_files:\n            md_content += f\"* {file}\\n\"\n\n        return DocumentConverterResult(title=\"Extracted Files\", text_content=md_content.strip())\n\n\nclass ImageConverter(MediaConverter):\n    \"\"\"\n    Converts images to markdown via extraction of metadata (if `exiftool` is installed), OCR (if `easyocr` is installed), and description via a multimodal LLM (if an mlm_client is configured).\n    \"\"\"\n\n    def convert(self, local_path, **kwargs) -> None | DocumentConverterResult:\n        # Bail if not a XLSX\n        extension = kwargs.get(\"file_extension\", \"\")\n        if extension.lower() not in [\".jpg\", \".jpeg\", \".png\"]:\n            return None\n\n        md_content = \"\"\n\n        # Add metadata\n        metadata = self._get_metadata(local_path)\n        if metadata:\n            for f in [\n                \"ImageSize\",\n                \"Title\",\n                \"Caption\",\n                \"Description\",\n                \"Keywords\",\n                \"Artist\",\n                \"Author\",\n                \"DateTimeOriginal\",\n                \"CreateDate\",\n                \"GPSPosition\",\n            ]:\n                if f in metadata:\n                    md_content += f\"{f}: {metadata[f]}\\n\"\n\n        # Try describing the image with GPTV\n        mlm_client = kwargs.get(\"mlm_client\")\n        mlm_model = kwargs.get(\"mlm_model\")\n        if mlm_client is not None and mlm_model is not None:\n            md_content += (\n                \"\\n# Description:\\n\"\n                + self._get_mlm_description(\n                    local_path, extension, mlm_client, mlm_model, prompt=kwargs.get(\"mlm_prompt\")\n                ).strip()\n              
  + \"\\n\"\n            )\n\n        return DocumentConverterResult(\n            title=None,\n            text_content=md_content,\n        )\n\n    def _get_mlm_description(self, local_path, extension, client, model, prompt=None):\n        if prompt is None or prompt.strip() == \"\":\n            prompt = \"Write a detailed caption for this image.\"\n\n        sys.stderr.write(f\"MLM Prompt:\\n{prompt}\\n\")\n\n        data_uri = \"\"\n        with open(local_path, \"rb\") as image_file:\n            content_type, encoding = mimetypes.guess_type(\"_dummy\" + extension)\n            if content_type is None:\n                content_type = \"image/jpeg\"\n            image_base64 = base64.b64encode(image_file.read()).decode(\"utf-8\")\n            data_uri = f\"data:{content_type};base64,{image_base64}\"\n\n        messages = [\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\"type\": \"text\", \"text\": prompt},\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\n                            \"url\": data_uri,\n                        },\n                    },\n                ],\n            }\n        ]\n\n        response = client.chat.completions.create(model=model, messages=messages)\n        return response.choices[0].message.content\n\n\nclass FileConversionException(Exception):\n    pass\n\n\nclass UnsupportedFormatException(Exception):\n    pass\n\n\nclass MarkdownConverter:\n    \"\"\"(In preview) An extremely simple text-based document reader, suitable for LLM use.\n    This reader will convert common file-types or webpages to Markdown.\"\"\"\n\n    def __init__(\n        self,\n        requests_session: requests.Session | None = None,\n        mlm_client: Any | None = None,\n        mlm_model: Any | None = None,\n    ):\n        if requests_session is None:\n            self._requests_session = requests.Session()\n        else:\n   
         self._requests_session = requests_session\n\n        self._mlm_client = mlm_client\n        self._mlm_model = mlm_model\n\n        self._page_converters: list[DocumentConverter] = []\n\n        # Register converters for successful browsing operations\n        # Later registrations are tried first / take higher priority than earlier registrations\n        # To this end, the most specific converters should appear below the most generic converters\n        self.register_page_converter(PlainTextConverter())\n        self.register_page_converter(HtmlConverter())\n        self.register_page_converter(WikipediaConverter())\n        self.register_page_converter(YouTubeConverter())\n        self.register_page_converter(DocxConverter())\n        self.register_page_converter(XlsxConverter())\n        self.register_page_converter(PptxConverter())\n        self.register_page_converter(WavConverter())\n        self.register_page_converter(Mp3Converter())\n        self.register_page_converter(ImageConverter())\n        self.register_page_converter(ZipConverter())\n        self.register_page_converter(PdfConverter())\n\n    def convert(\n        self, source: str | requests.Response, **kwargs: Any\n    ) -> DocumentConverterResult:  # TODO: deal with kwargs\n        \"\"\"\n        Args:\n            - source: can be a string representing a path or url, or a requests.response object\n            - extension: specifies the file extension to use when interpreting the file. 
If None, infer from source (path, uri, content-type, etc.)\n        \"\"\"\n\n        # Local path or url\n        if isinstance(source, str):\n            if source.startswith(\"http://\") or source.startswith(\"https://\") or source.startswith(\"file://\"):\n                return self.convert_url(source, **kwargs)\n            else:\n                return self.convert_local(source, **kwargs)\n        # Request response\n        elif isinstance(source, requests.Response):\n            return self.convert_response(source, **kwargs)\n\n    def convert_local(self, path: str, **kwargs: Any) -> DocumentConverterResult:  # TODO: deal with kwargs\n        # Prepare a list of extensions to try (in order of priority)\n        ext = kwargs.get(\"file_extension\")\n        extensions = [ext] if ext is not None else []\n\n        # Get extension alternatives from the path and puremagic\n        base, ext = os.path.splitext(path)\n        self._append_ext(extensions, ext)\n        self._append_ext(extensions, self._guess_ext_magic(path))\n\n        # Convert\n        return self._convert(path, extensions, **kwargs)\n\n    # TODO what should stream's type be?\n    def convert_stream(self, stream: Any, **kwargs: Any) -> DocumentConverterResult:  # TODO: deal with kwargs\n        # Prepare a list of extensions to try (in order of priority)\n        ext = kwargs.get(\"file_extension\")\n        extensions = [ext] if ext is not None else []\n\n        # Save the file locally to a temporary file. 
It will be deleted before this method exits\n        handle, temp_path = tempfile.mkstemp()\n        fh = os.fdopen(handle, \"wb\")\n        result = None\n        try:\n            # Write to the temporary file\n            content = stream.read()\n            if isinstance(content, str):\n                fh.write(content.encode(\"utf-8\"))\n            else:\n                fh.write(content)\n            fh.close()\n\n            # Use puremagic to check for more extension options\n            self._append_ext(extensions, self._guess_ext_magic(temp_path))\n\n            # Convert\n            result = self._convert(temp_path, extensions, **kwargs)\n        # Clean up\n        finally:\n            try:\n                fh.close()\n            except Exception:\n                pass\n            os.unlink(temp_path)\n\n        return result\n\n    def convert_url(self, url: str, **kwargs: Any) -> DocumentConverterResult:  # TODO: fix kwargs type\n        # Send a HTTP request to the URL\n        user_agent = \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0\"\n        response = self._requests_session.get(url, stream=True, headers={\"User-Agent\": user_agent})\n        response.raise_for_status()\n        return self.convert_response(response, **kwargs)\n\n    def convert_response(\n        self, response: requests.Response, **kwargs: Any\n    ) -> DocumentConverterResult:  # TODO fix kwargs type\n        # Prepare a list of extensions to try (in order of priority)\n        ext = kwargs.get(\"file_extension\")\n        extensions = [ext] if ext is not None else []\n\n        # Guess from the mimetype\n        content_type = response.headers.get(\"content-type\", \"\").split(\";\")[0]\n        self._append_ext(extensions, mimetypes.guess_extension(content_type))\n\n        # Read the content disposition if there is one\n        content_disposition = 
response.headers.get(\"content-disposition\", \"\")\n        m = re.search(r\"filename=([^;]+)\", content_disposition)\n        if m:\n            base, ext = os.path.splitext(m.group(1).strip(\"\\\"'\"))\n            self._append_ext(extensions, ext)\n\n        # Read the extension from the URL path\n        base, ext = os.path.splitext(urlparse(response.url).path)\n        self._append_ext(extensions, ext)\n\n        # Save the file locally to a temporary file. It will be deleted before this method exits\n        handle, temp_path = tempfile.mkstemp()\n        fh = os.fdopen(handle, \"wb\")\n        result = None\n        try:\n            # Download the file\n            for chunk in response.iter_content(chunk_size=512):\n                fh.write(chunk)\n            fh.close()\n\n            # Use puremagic to check for more extension options\n            self._append_ext(extensions, self._guess_ext_magic(temp_path))\n\n            # Convert\n            result = self._convert(temp_path, extensions, url=response.url)\n        except Exception as e:\n            print(f\"Error in converting: {e}\")\n\n        # Clean up\n        finally:\n            try:\n                fh.close()\n            except Exception:\n                pass\n            os.unlink(temp_path)\n\n        return result\n\n    def _convert(self, local_path: str, extensions: list[str | None], **kwargs) -> DocumentConverterResult:\n        error_trace = \"\"\n        for ext in extensions + [None]:  # Try last with no extension\n            for converter in self._page_converters:\n                _kwargs = copy.deepcopy(kwargs)\n\n                # Overwrite file_extension appropriately\n                if ext is None:\n                    if \"file_extension\" in _kwargs:\n                        del _kwargs[\"file_extension\"]\n                else:\n                    _kwargs.update({\"file_extension\": ext})\n\n                # Copy any additional global options\n                if 
\"mlm_client\" not in _kwargs and self._mlm_client is not None:\n                    _kwargs[\"mlm_client\"] = self._mlm_client\n\n                if \"mlm_model\" not in _kwargs and self._mlm_model is not None:\n                    _kwargs[\"mlm_model\"] = self._mlm_model\n\n                # If we hit an error, log it and keep trying.\n                # Reset res first so a failing converter cannot leave it unbound.\n                res = None\n                try:\n                    res = converter.convert(local_path, **_kwargs)\n                except Exception:\n                    error_trace = (\"\\n\\n\" + traceback.format_exc()).strip()\n\n                if res is not None:\n                    # Normalize the content\n                    res.text_content = \"\\n\".join([line.rstrip() for line in re.split(r\"\\r?\\n\", res.text_content)])\n                    res.text_content = re.sub(r\"\\n{3,}\", \"\\n\\n\", res.text_content)\n\n                    # Todo\n                    return res\n\n        # If we got this far without success, report any exceptions\n        if len(error_trace) > 0:\n            raise FileConversionException(\n                f\"Could not convert '{local_path}' to Markdown. File type was recognized as {extensions}. While converting the file, the following error was encountered:\\n\\n{error_trace}\"\n            )\n\n        # Nothing can handle it!\n        raise UnsupportedFormatException(\n            f\"Could not convert '{local_path}' to Markdown. 
The formats {extensions} are not supported.\"\n        )\n\n    def _append_ext(self, extensions, ext):\n        \"\"\"Append a unique non-None, non-empty extension to a list of extensions.\"\"\"\n        if ext is None:\n            return\n        ext = ext.strip()\n        if ext == \"\":\n            return\n        # if ext not in extensions:\n        if True:\n            extensions.append(ext)\n\n    def _guess_ext_magic(self, path):\n        \"\"\"Use puremagic (a Python implementation of libmagic) to guess a file's extension based on the first few bytes.\"\"\"\n        # Use puremagic to guess\n        try:\n            guesses = puremagic.magic_file(path)\n            if len(guesses) > 0:\n                ext = guesses[0].extension.strip()\n                if len(ext) > 0:\n                    return ext\n        except FileNotFoundError:\n            pass\n        except IsADirectoryError:\n            pass\n        except PermissionError:\n            pass\n        return None\n\n    def register_page_converter(self, converter: DocumentConverter) -> None:\n        \"\"\"Register a page text converter.\"\"\"\n        self._page_converters.insert(0, converter)\n"
  },
  {
    "path": "examples/open_deep_research/scripts/reformulator.py",
    "content": "# Shamelessly stolen from Microsoft Autogen team: thanks to them for this great resource!\n# https://github.com/microsoft/autogen/blob/gaia_multiagent_v01_march_1st/autogen/browser_utils.py\nimport copy\n\nfrom smolagents.models import MessageRole, Model\n\n\ndef prepare_response(original_task: str, inner_messages, reformulation_model: Model) -> str:\n    messages = [\n        {\n            \"role\": MessageRole.SYSTEM,\n            \"content\": [\n                {\n                    \"type\": \"text\",\n                    \"text\": f\"\"\"Earlier you were asked the following:\n\n{original_task}\n\nYour team then worked diligently to address that request. Read below a transcript of that conversation:\"\"\",\n                }\n            ],\n        }\n    ]\n\n    # The first message just repeats the question, so remove it\n    # if len(inner_messages) > 1:\n    #    del inner_messages[0]\n\n    # copy them to this context\n    try:\n        for message in inner_messages:\n            if not message.content:\n                continue\n            message = copy.deepcopy(message)\n            message.role = MessageRole.USER\n            messages.append(message)\n    except Exception:\n        messages += [{\"role\": MessageRole.ASSISTANT, \"content\": str(inner_messages)}]\n\n    # ask for the final answer\n    messages.append(\n        {\n            \"role\": MessageRole.USER,\n            \"content\": [\n                {\n                    \"type\": \"text\",\n                    \"text\": f\"\"\"\nRead the above conversation and output a FINAL ANSWER to the question. 
The question is repeated here for convenience:\n\n{original_task}\n\nTo output the final answer, use the following template: FINAL ANSWER: [YOUR FINAL ANSWER]\nYour FINAL ANSWER should be a number OR as few words as possible OR a comma separated list of numbers and/or strings.\nADDITIONALLY, your FINAL ANSWER MUST adhere to any formatting instructions specified in the original question (e.g., alphabetization, sequencing, units, rounding, decimal places, etc.)\nIf you are asked for a number, express it numerically (i.e., with digits rather than words), don't use commas, and DO NOT INCLUDE UNITS such as $ or USD or percent signs unless specified otherwise.\nIf you are asked for a string, don't use articles or abbreviations (e.g. for cities), unless specified otherwise. Don't output any final sentence punctuation such as '.', '!', or '?'.\nIf you are asked for a comma separated list, apply the above rules depending on whether the elements are numbers or strings.\nIf you are unable to determine the final answer, output 'FINAL ANSWER: Unable to determine'\n\"\"\",\n                }\n            ],\n        }\n    )\n\n    response = reformulation_model(messages).content\n\n    final_answer = response.split(\"FINAL ANSWER: \")[-1].strip()\n    print(\"> Reformulated answer: \", final_answer)\n\n    #     if \"unable to determine\" in final_answer.lower():\n    #         messages.append({\"role\": MessageRole.ASSISTANT, \"content\": response })\n    #         messages.append({\"role\": MessageRole.USER, \"content\": [{\"type\": \"text\", \"text\": \"\"\"\n    # I understand that a definitive answer could not be determined. Please make a well-informed EDUCATED GUESS based on the conversation.\n\n    # To output the educated guess, use the following template: EDUCATED GUESS: [YOUR EDUCATED GUESS]\n    # Your EDUCATED GUESS should be a number OR as few words as possible OR a comma separated list of numbers and/or strings. 
DO NOT OUTPUT 'I don't know', 'Unable to determine', etc.\n    # ADDITIONALLY, your EDUCATED GUESS MUST adhere to any formatting instructions specified in the original question (e.g., alphabetization, sequencing, units, rounding, decimal places, etc.)\n    # If you are asked for a number, express it numerically (i.e., with digits rather than words), don't use commas, and don't include units such as $ or percent signs unless specified otherwise.\n    # If you are asked for a string, don't use articles or abbreviations (e.g. cit for cities), unless specified otherwise. Don't output any final sentence punctuation such as '.', '!', or '?'.\n    # If you are asked for a comma separated list, apply the above rules depending on whether the elements are numbers or strings.\n    # \"\"\".strip()}]})\n\n    #         response = model(messages).content\n    #         print(\"\\n>>>Making an educated guess.\\n\", response)\n    #         final_answer = response.split(\"EDUCATED GUESS: \")[-1].strip()\n    return final_answer\n"
  },
  {
    "path": "examples/open_deep_research/scripts/run_agents.py",
    "content": "import json\nimport os\nimport shutil\nimport textwrap\nfrom pathlib import Path\n\n# import tqdm.asyncio\nfrom smolagents.utils import AgentError\n\n\ndef serialize_agent_error(obj):\n    if isinstance(obj, AgentError):\n        return {\"error_type\": obj.__class__.__name__, \"message\": obj.message}\n    else:\n        return str(obj)\n\n\ndef get_image_description(file_name: str, question: str, visual_inspection_tool) -> str:\n    prompt = f\"\"\"Write a caption of 5 sentences for this image. Pay special attention to any details that might be useful for someone answering the following question:\n{question}. But do not try to answer the question directly!\nDo not add any information that is not present in the image.\"\"\"\n    return visual_inspection_tool(image_path=file_name, question=prompt)\n\n\ndef get_document_description(file_path: str, question: str, document_inspection_tool) -> str:\n    prompt = f\"\"\"Write a caption of 5 sentences for this document. Pay special attention to any details that might be useful for someone answering the following question:\n{question}. 
But do not try to answer the question directly!\nDo not add any information that is not present in the document.\"\"\"\n    return document_inspection_tool.forward_initial_exam_mode(file_path=file_path, question=prompt)\n\n\ndef get_single_file_description(file_path: str, question: str, visual_inspection_tool, document_inspection_tool):\n    file_extension = file_path.split(\".\")[-1]\n    if file_extension in [\"png\", \"jpg\", \"jpeg\"]:\n        file_description = f\" - Attached image: {file_path}\"\n        file_description += (\n            f\"\\n     -> Image description: {get_image_description(file_path, question, visual_inspection_tool)}\"\n        )\n        return file_description\n    elif file_extension in [\"pdf\", \"xls\", \"xlsx\", \"docx\", \"doc\", \"xml\"]:\n        image_path = file_path.split(\".\")[0] + \".png\"\n        if os.path.exists(image_path):\n            description = get_image_description(image_path, question, visual_inspection_tool)\n            file_path = image_path\n        else:\n            description = get_document_description(file_path, question, document_inspection_tool)\n        file_description = f\" - Attached document: {file_path}\"\n        file_description += f\"\\n     -> File description: {description}\"\n        return file_description\n    elif file_extension in [\"mp3\", \"m4a\", \"wav\"]:\n        return f\" - Attached audio: {file_path}\"\n    else:\n        return f\" - Attached file: {file_path}\"\n\n\ndef get_zip_description(file_path: str, question: str, visual_inspection_tool, document_inspection_tool):\n    folder_path = file_path.replace(\".zip\", \"\")\n    os.makedirs(folder_path, exist_ok=True)\n    shutil.unpack_archive(file_path, folder_path)\n\n    prompt_use_files = \"\"\n    for root, dirs, files in os.walk(folder_path):\n        for file in files:\n            file_path = os.path.join(root, file)\n            prompt_use_files += \"\\n\" + textwrap.indent(\n                
get_single_file_description(file_path, question, visual_inspection_tool, document_inspection_tool),\n                prefix=\"    \",\n            )\n    return prompt_use_files\n\n\ndef get_tasks_to_run(data, total: int, base_filename: Path, tasks_ids: list[int] | None):\n    f = base_filename.parent / f\"{base_filename.stem}_answers.jsonl\"\n    done = set()\n    if f.exists():\n        with open(f, encoding=\"utf-8\") as fh:\n            done = {json.loads(line)[\"task_id\"] for line in fh if line.strip()}\n\n    tasks = []\n    for i in range(total):\n        task_id = int(data[i][\"task_id\"])\n        # Keep tasks that are not done yet and, when a filter is given, that are listed in tasks_ids\n        if task_id not in done and (tasks_ids is None or task_id in tasks_ids):\n            tasks.append(data[i])\n    return tasks\n"
  },
  {
    "path": "examples/open_deep_research/scripts/text_inspector_tool.py",
    "content": "from smolagents import Tool\nfrom smolagents.models import Model\n\n\nclass TextInspectorTool(Tool):\n    name = \"inspect_file_as_text\"\n    description = \"\"\"\nYou cannot load files yourself: instead call this tool to read a file as markdown text and ask questions about it.\nThis tool handles the following file extensions: [\".html\", \".htm\", \".xlsx\", \".pptx\", \".wav\", \".mp3\", \".m4a\", \".flac\", \".pdf\", \".docx\"], and all other types of text files. IT DOES NOT HANDLE IMAGES.\"\"\"\n\n    inputs = {\n        \"file_path\": {\n            \"description\": \"The path to the file you want to read as text. Must be a '.something' file, like '.pdf'. If it is an image, use the visualizer tool instead! DO NOT use this tool for an HTML webpage: use the web_search tool instead!\",\n            \"type\": \"string\",\n        },\n        \"question\": {\n            \"description\": \"[Optional]: Your question, as a natural language sentence. Provide as much context as possible. 
Do not pass this parameter if you just want to directly return the content of the file.\",\n            \"type\": \"string\",\n            \"nullable\": True,\n        },\n    }\n    output_type = \"string\"\n\n    def __init__(self, model: Model | None = None, text_limit: int = 100000):\n        super().__init__()\n        self.model = model\n        self.text_limit = text_limit\n        from .mdconvert import MarkdownConverter\n\n        self.md_converter = MarkdownConverter()\n\n    def forward_initial_exam_mode(self, file_path, question):\n        from smolagents.models import MessageRole\n\n        # Reject images before spending time on conversion\n        if file_path.lower().endswith((\".png\", \".jpg\", \".jpeg\")):\n            raise Exception(\"Cannot use inspect_file_as_text tool with images: use visualizer instead!\")\n\n        result = self.md_converter.convert(file_path)\n\n        if \".zip\" in file_path:\n            return result.text_content\n\n        if not question:\n            return result.text_content\n\n        if len(result.text_content) < 4000:\n            return \"Document content: \" + result.text_content\n\n        messages = [\n            {\n                \"role\": MessageRole.SYSTEM,\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Here is a file:\\n### \"\n                        + str(result.title)\n                        + \"\\n\\n\"\n                        + result.text_content[: self.text_limit],\n                    }\n                ],\n            },\n            {\n                \"role\": MessageRole.USER,\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Now please write a short, 5 sentence caption for this document, that could help someone asking this question: \"\n                        + question\n                        + \"\\n\\nDon't answer the question yourself! 
Just provide useful notes on the document\",\n                    }\n                ],\n            },\n        ]\n        return self.model(messages).content\n\n    def forward(self, file_path, question: str | None = None) -> str:\n        from smolagents.models import MessageRole\n\n        # Reject images before spending time on conversion\n        if file_path.lower().endswith((\".png\", \".jpg\", \".jpeg\")):\n            raise Exception(\"Cannot use inspect_file_as_text tool with images: use visualizer instead!\")\n\n        result = self.md_converter.convert(file_path)\n\n        if \".zip\" in file_path:\n            return result.text_content\n\n        if not question:\n            return result.text_content\n\n        messages = [\n            {\n                \"role\": MessageRole.SYSTEM,\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"You will have to write a short caption for this file, then answer this question: \"\n                        + question,\n                    }\n                ],\n            },\n            {\n                \"role\": MessageRole.USER,\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Here is the complete file:\\n### \"\n                        + str(result.title)\n                        + \"\\n\\n\"\n                        + result.text_content[: self.text_limit],\n                    }\n                ],\n            },\n            {\n                \"role\": MessageRole.USER,\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Now answer the question below. Use these three headings: '1. Short answer', '2. Extremely detailed answer', '3. 
Additional Context on the document and question asked'.\\n\\n\"\n                        + question,\n                    }\n                ],\n            },\n        ]\n        return self.model(messages).content\n"
  },
  {
    "path": "examples/open_deep_research/scripts/text_web_browser.py",
    "content": "# Shamelessly stolen from Microsoft Autogen team: thanks to them for this great resource!\n# https://github.com/microsoft/autogen/blob/gaia_multiagent_v01_march_1st/autogen/browser_utils.py\nimport mimetypes\nimport os\nimport pathlib\nimport re\nimport time\nimport uuid\nfrom typing import Any\nfrom urllib.parse import unquote, urljoin, urlparse\n\nimport pathvalidate\nimport requests\nfrom serpapi import GoogleSearch\n\nfrom smolagents import Tool\n\nfrom .cookies import COOKIES\nfrom .mdconvert import FileConversionException, MarkdownConverter, UnsupportedFormatException\n\n\nclass SimpleTextBrowser:\n    \"\"\"(In preview) An extremely simple text-based web browser comparable to Lynx. Suitable for Agentic use.\"\"\"\n\n    def __init__(\n        self,\n        start_page: str | None = None,\n        viewport_size: int | None = 1024 * 8,\n        downloads_folder: str | None | None = None,\n        serpapi_key: str | None | None = None,\n        request_kwargs: dict[str, Any] | None | None = None,\n    ):\n        self.start_page: str = start_page if start_page else \"about:blank\"\n        self.viewport_size = viewport_size  # Applies only to the standard uri types\n        self.downloads_folder = downloads_folder\n        self.history: list[tuple[str, float]] = list()\n        self.page_title: str | None = None\n        self.viewport_current_page = 0\n        self.viewport_pages: list[tuple[int, int]] = list()\n        self.set_address(self.start_page)\n        self.serpapi_key = serpapi_key\n        self.request_kwargs = request_kwargs\n        self.request_kwargs[\"cookies\"] = COOKIES\n        self._mdconvert = MarkdownConverter()\n        self._page_content: str = \"\"\n\n        self._find_on_page_query: str | None = None\n        self._find_on_page_last_result: int | None = None  # Location of the last result\n\n    @property\n    def address(self) -> str:\n        \"\"\"Return the address of the current page.\"\"\"\n        return 
self.history[-1][0]\n\n    def set_address(self, uri_or_path: str, filter_year: int | None = None) -> None:\n        # TODO: Handle anchors\n        self.history.append((uri_or_path, time.time()))\n\n        # Handle special URIs\n        if uri_or_path == \"about:blank\":\n            self._set_page_content(\"\")\n        elif uri_or_path.startswith(\"google:\"):\n            self._serpapi_search(uri_or_path[len(\"google:\") :].strip(), filter_year=filter_year)\n        else:\n            if (\n                not uri_or_path.startswith(\"http:\")\n                and not uri_or_path.startswith(\"https:\")\n                and not uri_or_path.startswith(\"file:\")\n            ):\n                if len(self.history) > 1:\n                    prior_address = self.history[-2][0]\n                    uri_or_path = urljoin(prior_address, uri_or_path)\n                    # Update the address with the fully-qualified path\n                    self.history[-1] = (uri_or_path, self.history[-1][1])\n            self._fetch_page(uri_or_path)\n\n        # Reset the viewport and the find-on-page state used by find_on_page() and find_next()\n        self.viewport_current_page = 0\n        self._find_on_page_query = None\n        self._find_on_page_last_result = None\n\n    @property\n    def viewport(self) -> str:\n        \"\"\"Return the content of the current viewport.\"\"\"\n        bounds = self.viewport_pages[self.viewport_current_page]\n        return self.page_content[bounds[0] : bounds[1]]\n\n    @property\n    def page_content(self) -> str:\n        \"\"\"Return the full contents of the current page.\"\"\"\n        return self._page_content\n\n    def _set_page_content(self, content: str) -> None:\n        \"\"\"Sets the text content of the current page.\"\"\"\n        self._page_content = content\n        self._split_pages()\n        if self.viewport_current_page >= len(self.viewport_pages):\n            self.viewport_current_page = len(self.viewport_pages) - 1\n\n    def page_down(self) -> None:\n        self.viewport_current_page = 
min(self.viewport_current_page + 1, len(self.viewport_pages) - 1)\n\n    def page_up(self) -> None:\n        self.viewport_current_page = max(self.viewport_current_page - 1, 0)\n\n    def find_on_page(self, query: str) -> str | None:\n        \"\"\"Searches for the query from the current viewport forward, looping back to the start if necessary.\"\"\"\n\n        # Did we get here via a previous find_on_page search with the same query?\n        # If so, map to find_next\n        if query == self._find_on_page_query and self.viewport_current_page == self._find_on_page_last_result:\n            return self.find_next()\n\n        # Ok it's a new search start from the current viewport\n        self._find_on_page_query = query\n        viewport_match = self._find_next_viewport(query, self.viewport_current_page)\n        if viewport_match is None:\n            self._find_on_page_last_result = None\n            return None\n        else:\n            self.viewport_current_page = viewport_match\n            self._find_on_page_last_result = viewport_match\n            return self.viewport\n\n    def find_next(self) -> str | None:\n        \"\"\"Scroll to the next viewport that matches the query\"\"\"\n\n        if self._find_on_page_query is None:\n            return None\n\n        starting_viewport = self._find_on_page_last_result\n        if starting_viewport is None:\n            starting_viewport = 0\n        else:\n            starting_viewport += 1\n            if starting_viewport >= len(self.viewport_pages):\n                starting_viewport = 0\n\n        viewport_match = self._find_next_viewport(self._find_on_page_query, starting_viewport)\n        if viewport_match is None:\n            self._find_on_page_last_result = None\n            return None\n        else:\n            self.viewport_current_page = viewport_match\n            self._find_on_page_last_result = viewport_match\n            return self.viewport\n\n    def _find_next_viewport(self, query: str, 
starting_viewport: int) -> int | None:\n        \"\"\"Search for matches between the starting viewport looping when reaching the end.\"\"\"\n\n        if query is None:\n            return None\n\n        # Normalize the query, and convert to a regular expression\n        nquery = re.sub(r\"\\*\", \"__STAR__\", query)\n        nquery = \" \" + (\" \".join(re.split(r\"\\W+\", nquery))).strip() + \" \"\n        nquery = nquery.replace(\" __STAR__ \", \"__STAR__ \")  # Merge isolated stars with prior word\n        nquery = nquery.replace(\"__STAR__\", \".*\").lower()\n\n        if nquery.strip() == \"\":\n            return None\n\n        idxs = list()\n        idxs.extend(range(starting_viewport, len(self.viewport_pages)))\n        idxs.extend(range(0, starting_viewport))\n\n        for i in idxs:\n            bounds = self.viewport_pages[i]\n            content = self.page_content[bounds[0] : bounds[1]]\n\n            # TODO: Remove markdown links and images\n            ncontent = \" \" + (\" \".join(re.split(r\"\\W+\", content))).strip().lower() + \" \"\n            if re.search(nquery, ncontent):\n                return i\n\n        return None\n\n    def visit_page(self, path_or_uri: str, filter_year: int | None = None) -> str:\n        \"\"\"Update the address, visit the page, and return the content of the viewport.\"\"\"\n        self.set_address(path_or_uri, filter_year=filter_year)\n        return self.viewport\n\n    def _split_pages(self) -> None:\n        # Do not split search results\n        if self.address.startswith(\"google:\"):\n            self.viewport_pages = [(0, len(self._page_content))]\n            return\n\n        # Handle empty pages\n        if len(self._page_content) == 0:\n            self.viewport_pages = [(0, 0)]\n            return\n\n        # Break the viewport into pages\n        self.viewport_pages = []\n        start_idx = 0\n        while start_idx < len(self._page_content):\n            end_idx = min(start_idx + 
self.viewport_size, len(self._page_content))  # type: ignore[operator]\n            # Adjust to end on a space\n            while end_idx < len(self._page_content) and self._page_content[end_idx - 1] not in [\" \", \"\\t\", \"\\r\", \"\\n\"]:\n                end_idx += 1\n            self.viewport_pages.append((start_idx, end_idx))\n            start_idx = end_idx\n\n    def _serpapi_search(self, query: str, filter_year: int | None = None) -> None:\n        if self.serpapi_key is None:\n            raise ValueError(\"Missing SerpAPI key.\")\n\n        params = {\n            \"engine\": \"google\",\n            \"q\": query,\n            \"api_key\": self.serpapi_key,\n        }\n        if filter_year is not None:\n            params[\"tbs\"] = f\"cdr:1,cd_min:01/01/{filter_year},cd_max:12/31/{filter_year}\"\n\n        search = GoogleSearch(params)\n        results = search.get_dict()\n        self.page_title = f\"{query} - Search\"\n        if \"organic_results\" not in results.keys():\n            raise Exception(f\"No results found for query: '{query}'. Use a less specific query.\")\n        if len(results[\"organic_results\"]) == 0:\n            year_filter_message = f\" with filter year={filter_year}\" if filter_year is not None else \"\"\n            self._set_page_content(\n                f\"No results found for '{query}'{year_filter_message}. 
Try with a more general query, or remove the year filter.\"\n            )\n            return\n\n        def _prev_visit(url):\n            for i in range(len(self.history) - 1, -1, -1):\n                if self.history[i][0] == url:\n                    return f\"You previously visited this page {round(time.time() - self.history[i][1])} seconds ago.\\n\"\n            return \"\"\n\n        web_snippets: list[str] = list()\n        idx = 0\n        if \"organic_results\" in results:\n            for page in results[\"organic_results\"]:\n                idx += 1\n                date_published = \"\"\n                if \"date\" in page:\n                    date_published = \"\\nDate published: \" + page[\"date\"]\n\n                source = \"\"\n                if \"source\" in page:\n                    source = \"\\nSource: \" + page[\"source\"]\n\n                snippet = \"\"\n                if \"snippet\" in page:\n                    snippet = \"\\n\" + page[\"snippet\"]\n\n                redacted_version = f\"{idx}. 
[{page['title']}]({page['link']}){date_published}{source}\\n{_prev_visit(page['link'])}{snippet}\"\n\n                redacted_version = redacted_version.replace(\"Your browser can't play this video.\", \"\")\n                web_snippets.append(redacted_version)\n\n        content = (\n            f\"A Google search for '{query}' found {len(web_snippets)} results:\\n\\n## Web Results\\n\"\n            + \"\\n\\n\".join(web_snippets)\n        )\n\n        self._set_page_content(content)\n\n    def _fetch_page(self, url: str) -> None:\n        download_path = \"\"\n        try:\n            if url.startswith(\"file://\"):\n                download_path = os.path.normcase(os.path.normpath(unquote(url[7:])))\n                res = self._mdconvert.convert_local(download_path)\n                self.page_title = res.title\n                self._set_page_content(res.text_content)\n            else:\n                # Prepare the request parameters\n                request_kwargs = self.request_kwargs.copy() if self.request_kwargs is not None else {}\n                request_kwargs[\"stream\"] = True\n\n                # Send a HTTP request to the URL\n                response = requests.get(url, **request_kwargs)\n                response.raise_for_status()\n\n                # If the HTTP request was successful\n                content_type = response.headers.get(\"content-type\", \"\")\n\n                # Text or HTML\n                if \"text/\" in content_type.lower():\n                    res = self._mdconvert.convert_response(response)\n                    self.page_title = res.title\n                    self._set_page_content(res.text_content)\n                # A download\n                else:\n                    # Try producing a safe filename\n                    fname = None\n                    download_path = None\n                    try:\n                        fname = pathvalidate.sanitize_filename(os.path.basename(urlparse(url).path)).strip()\n       
                 download_path = os.path.abspath(os.path.join(self.downloads_folder, fname))\n\n                        suffix = 0\n                        while os.path.exists(download_path) and suffix < 1000:\n                            suffix += 1\n                            base, ext = os.path.splitext(fname)\n                            new_fname = f\"{base}__{suffix}{ext}\"\n                            download_path = os.path.abspath(os.path.join(self.downloads_folder, new_fname))\n\n                    except NameError:\n                        pass\n\n                    # No suitable name (missing or empty), so make one\n                    if not fname:\n                        extension = mimetypes.guess_extension(content_type)\n                        if extension is None:\n                            extension = \".download\"\n                        fname = str(uuid.uuid4()) + extension\n                        download_path = os.path.abspath(os.path.join(self.downloads_folder, fname))\n\n                    # Open a file for writing\n                    with open(download_path, \"wb\") as fh:\n                        for chunk in response.iter_content(chunk_size=512):\n                            fh.write(chunk)\n\n                    # Render it\n                    local_uri = pathlib.Path(download_path).as_uri()\n                    self.set_address(local_uri)\n\n        except (UnsupportedFormatException, FileConversionException) as e:\n            print(e)\n            self.page_title = \"Download complete.\"\n            self._set_page_content(f\"# Download complete\\n\\nSaved file to '{download_path}'\")\n        except FileNotFoundError:\n            self.page_title = \"Error 404\"\n            self._set_page_content(f\"## Error 404\\n\\nFile not found: 
{download_path}\")\n        except requests.exceptions.RequestException as request_exception:\n            try:\n                self.page_title = f\"Error {response.status_code}\"\n\n                # If the error was rendered in HTML we might as well render it\n                content_type = response.headers.get(\"content-type\", \"\")\n                if content_type is not None and \"text/html\" in content_type.lower():\n                    res = self._mdconvert.convert(response)\n                    self.page_title = f\"Error {response.status_code}\"\n                    self._set_page_content(f\"## Error {response.status_code}\\n\\n{res.text_content}\")\n                else:\n                    text = \"\"\n                    for chunk in response.iter_content(chunk_size=512, decode_unicode=True):\n                        text += chunk\n                    self.page_title = f\"Error {response.status_code}\"\n                    self._set_page_content(f\"## Error {response.status_code}\\n\\n{text}\")\n            except NameError:\n                self.page_title = \"Error\"\n                self._set_page_content(f\"## Error\\n\\n{str(request_exception)}\")\n\n    def _state(self) -> tuple[str, str]:\n        header = f\"Address: {self.address}\\n\"\n        if self.page_title is not None:\n            header += f\"Title: {self.page_title}\\n\"\n\n        current_page = self.viewport_current_page\n        total_pages = len(self.viewport_pages)\n\n        address = self.address\n        for i in range(len(self.history) - 2, -1, -1):  # Start from the second last\n            if self.history[i][0] == address:\n                header += f\"You previously visited this page {round(time.time() - self.history[i][1])} seconds ago.\\n\"\n                break\n\n        header += f\"Viewport position: Showing page {current_page + 1} of {total_pages}.\\n\"\n        return (header, self.viewport)\n\n\nclass SearchInformationTool(Tool):\n    name = \"web_search\"\n    
description = \"Perform a web search query (think a google search) and returns the search results.\"\n    inputs = {\"query\": {\"type\": \"string\", \"description\": \"The web search query to perform.\"}}\n    inputs[\"filter_year\"] = {\n        \"type\": \"string\",\n        \"description\": \"[Optional parameter]: filter the search results to only include pages from a specific year. For example, '2020' will only include pages from 2020. Make sure to use this parameter if you're trying to search for articles from a specific date!\",\n        \"nullable\": True,\n    }\n    output_type = \"string\"\n\n    def __init__(self, browser):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self, query: str, filter_year: int | None = None) -> str:\n        self.browser.visit_page(f\"google: {query}\", filter_year=filter_year)\n        header, content = self.browser._state()\n        return header.strip() + \"\\n=======================\\n\" + content\n\n\nclass VisitTool(Tool):\n    name = \"visit_page\"\n    description = \"Visit a webpage at a given URL and return its text. Given a url to a YouTube video, this returns the transcript.\"\n    inputs = {\"url\": {\"type\": \"string\", \"description\": \"The relative or absolute url of the webpage to visit.\"}}\n    output_type = \"string\"\n\n    def __init__(self, browser=None):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self, url: str) -> str:\n        self.browser.visit_page(url)\n        header, content = self.browser._state()\n        return header.strip() + \"\\n=======================\\n\" + content\n\n\nclass DownloadTool(Tool):\n    name = \"download_file\"\n    description = \"\"\"\nDownload a file at a given URL. 
The file should be in one of these formats: [\".xlsx\", \".pptx\", \".wav\", \".mp3\", \".m4a\", \".png\", \".docx\"]\nAfter using this tool, for further inspection of this page you should return the download path to your manager via final_answer, and they will be able to inspect it.\nDO NOT use this tool for .pdf or .txt or .htm files: for these types of files use visit_page with the file url instead.\"\"\"\n    inputs = {\"url\": {\"type\": \"string\", \"description\": \"The relative or absolute url of the file to be downloaded.\"}}\n    output_type = \"string\"\n\n    def __init__(self, browser):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self, url: str) -> str:\n        import requests\n\n        if \"arxiv\" in url:\n            url = url.replace(\"abs\", \"pdf\")\n        response = requests.get(url)\n        content_type = response.headers.get(\"content-type\", \"\")\n        # Drop parameters such as \"; charset=utf-8\" so that the MIME type can be looked up\n        extension = mimetypes.guess_extension(content_type.split(\";\")[0].strip())\n        if extension and isinstance(extension, str):\n            new_path = f\"./downloads/file{extension}\"\n        else:\n            new_path = \"./downloads/file.object\"\n\n        with open(new_path, \"wb\") as f:\n            f.write(response.content)\n\n        # extension is None when the content type is unknown, so guard before the substring checks\n        if extension and any(kind in extension for kind in (\"pdf\", \"txt\", \"htm\")):\n            raise Exception(\"Do not use this tool for pdf or txt or html files: use visit_page instead.\")\n\n        return f\"File was downloaded and saved under path {new_path}.\"\n\n\nclass ArchiveSearchTool(Tool):\n    name = \"find_archived_url\"\n    description = \"Given a url, searches the Wayback Machine and returns the archived version of the url that's closest in time to the desired date.\"\n    inputs = {\n        \"url\": {\"type\": \"string\", \"description\": \"The url you need the archive for.\"},\n        \"date\": {\n            \"type\": \"string\",\n            \"description\": \"The date that you want to find the archive for. 
Give this date in the format 'YYYYMMDD', for instance '27 June 2008' is written as '20080627'.\",\n        },\n    }\n    output_type = \"string\"\n\n    def __init__(self, browser=None):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self, url, date) -> str:\n        import requests\n\n        no_timestamp_url = f\"https://archive.org/wayback/available?url={url}\"\n        archive_url = no_timestamp_url + f\"&timestamp={date}\"\n        response = requests.get(archive_url).json()\n        response_notimestamp = requests.get(no_timestamp_url).json()\n        if \"archived_snapshots\" in response and \"closest\" in response[\"archived_snapshots\"]:\n            closest = response[\"archived_snapshots\"][\"closest\"]\n            print(\"Archive found!\", closest)\n\n        elif \"archived_snapshots\" in response_notimestamp and \"closest\" in response_notimestamp[\"archived_snapshots\"]:\n            closest = response_notimestamp[\"archived_snapshots\"][\"closest\"]\n            print(\"Archive found!\", closest)\n        else:\n            raise Exception(f\"Your {url=} was not archived on Wayback Machine, try a different url.\")\n        target_url = closest[\"url\"]\n        self.browser.visit_page(target_url)\n        header, content = self.browser._state()\n        return (\n            f\"Web archive for url {url}, snapshot taken at date {closest['timestamp'][:8]}:\\n\"\n            + header.strip()\n            + \"\\n=======================\\n\"\n            + content\n        )\n\n\nclass PageUpTool(Tool):\n    name = \"page_up\"\n    description = \"Scroll the viewport UP one page-length in the current webpage and return the new viewport content.\"\n    inputs = {}\n    output_type = \"string\"\n\n    def __init__(self, browser=None):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self) -> str:\n        self.browser.page_up()\n        header, content = self.browser._state()\n        return 
header.strip() + \"\\n=======================\\n\" + content\n\n\nclass PageDownTool(Tool):\n    name = \"page_down\"\n    description = (\n        \"Scroll the viewport DOWN one page-length in the current webpage and return the new viewport content.\"\n    )\n    inputs = {}\n    output_type = \"string\"\n\n    def __init__(self, browser=None):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self) -> str:\n        self.browser.page_down()\n        header, content = self.browser._state()\n        return header.strip() + \"\\n=======================\\n\" + content\n\n\nclass FinderTool(Tool):\n    name = \"find_on_page_ctrl_f\"\n    description = \"Scroll the viewport to the first occurrence of the search string. This is equivalent to Ctrl+F.\"\n    inputs = {\n        \"search_string\": {\n            \"type\": \"string\",\n            \"description\": \"The string to search for on the page. This search string supports wildcards like '*'\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, browser=None):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self, search_string: str) -> str:\n        find_result = self.browser.find_on_page(search_string)\n        header, content = self.browser._state()\n\n        if find_result is None:\n            return (\n                header.strip()\n                + f\"\\n=======================\\nThe search string '{search_string}' was not found on this page.\"\n            )\n        else:\n            return header.strip() + \"\\n=======================\\n\" + content\n\n\nclass FindNextTool(Tool):\n    name = \"find_next\"\n    description = \"Scroll the viewport to next occurrence of the search string. 
This is equivalent to finding the next match in a Ctrl+F search.\"\n    inputs = {}\n    output_type = \"string\"\n\n    def __init__(self, browser=None):\n        super().__init__()\n        self.browser = browser\n\n    def forward(self) -> str:\n        find_result = self.browser.find_next()\n        header, content = self.browser._state()\n\n        if find_result is None:\n            return header.strip() + \"\\n=======================\\nThe search string was not found on this page.\"\n        else:\n            return header.strip() + \"\\n=======================\\n\" + content\n"
  },
  {
    "path": "examples/open_deep_research/scripts/visual_qa.py",
    "content": "import base64\nimport json\nimport mimetypes\nimport os\nimport uuid\nfrom io import BytesIO\n\nimport PIL.Image\nimport requests\nfrom dotenv import load_dotenv\nfrom huggingface_hub import InferenceClient\n\nfrom smolagents import Tool, tool\n\n\nload_dotenv(override=True)\n\n\ndef process_images_and_text(image_path, query, client):\n    from transformers import AutoProcessor\n\n    messages = [\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\"type\": \"image\"},\n                {\"type\": \"text\", \"text\": query},\n            ],\n        },\n    ]\n    idefics_processor = AutoProcessor.from_pretrained(\"HuggingFaceM4/idefics2-8b-chatty\")\n    prompt_with_template = idefics_processor.apply_chat_template(messages, add_generation_prompt=True)\n\n    # load images from local directory\n\n    # encode images to strings which can be sent to the endpoint\n    def encode_local_image(image_path):\n        # load image\n        image = PIL.Image.open(image_path).convert(\"RGB\")\n\n        # Convert the image to a base64 string\n        buffer = BytesIO()\n        image.save(buffer, format=\"JPEG\")  # Use the appropriate format (e.g., JPEG, PNG)\n        base64_image = base64.b64encode(buffer.getvalue()).decode(\"utf-8\")\n\n        # add string formatting required by the endpoint\n        image_string = f\"data:image/jpeg;base64,{base64_image}\"\n\n        return image_string\n\n    image_string = encode_local_image(image_path)\n    prompt_with_images = prompt_with_template.replace(\"<image>\", \"![]({}) \").format(image_string)\n\n    payload = {\n        \"inputs\": prompt_with_images,\n        \"parameters\": {\n            \"return_full_text\": False,\n            \"max_new_tokens\": 200,\n        },\n    }\n\n    return json.loads(client.post(json=payload).decode())[0]\n\n\n# Function to encode the image\ndef encode_image(image_path):\n    if image_path.startswith(\"http\"):\n        user_agent = 
\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0\"\n        request_kwargs = {\n            \"headers\": {\"User-Agent\": user_agent},\n            \"stream\": True,\n        }\n\n        # Send a HTTP request to the URL\n        response = requests.get(image_path, **request_kwargs)\n        response.raise_for_status()\n        content_type = response.headers.get(\"content-type\", \"\")\n\n        extension = mimetypes.guess_extension(content_type)\n        if extension is None:\n            extension = \".download\"\n\n        fname = str(uuid.uuid4()) + extension\n        download_path = os.path.abspath(os.path.join(\"downloads\", fname))\n\n        with open(download_path, \"wb\") as fh:\n            for chunk in response.iter_content(chunk_size=512):\n                fh.write(chunk)\n\n        image_path = download_path\n\n    with open(image_path, \"rb\") as image_file:\n        return base64.b64encode(image_file.read()).decode(\"utf-8\")\n\n\ndef resize_image(image_path):\n    img = PIL.Image.open(image_path)\n    width, height = img.size\n    img = img.resize((int(width / 2), int(height / 2)))\n    new_image_path = f\"resized_{image_path}\"\n    img.save(new_image_path)\n    return new_image_path\n\n\nclass VisualQATool(Tool):\n    name = \"visualizer\"\n    description = \"A tool that can answer questions about attached images.\"\n    inputs = {\n        \"image_path\": {\n            \"description\": \"The path to the image on which to answer the question\",\n            \"type\": \"string\",\n        },\n        \"question\": {\"description\": \"the question to answer\", \"type\": \"string\", \"nullable\": True},\n    }\n    output_type = \"string\"\n\n    client = InferenceClient(\"HuggingFaceM4/idefics2-8b-chatty\")\n\n    def forward(self, image_path: str, question: str | None = None) -> str:\n        output = \"\"\n        add_note = False\n        if not question:\n        
    add_note = True\n            question = \"Please write a detailed caption for this image.\"\n        try:\n            output = process_images_and_text(image_path, question, self.client)\n        except Exception as e:\n            print(e)\n            if \"Payload Too Large\" in str(e):\n                new_image_path = resize_image(image_path)\n                output = process_images_and_text(new_image_path, question, self.client)\n\n        if add_note:\n            output = (\n                f\"You did not provide a particular question, so here is a detailed caption for the image: {output}\"\n            )\n\n        return output\n\n\n@tool\ndef visualizer(image_path: str, question: str | None = None) -> str:\n    \"\"\"A tool that can answer questions about attached images.\n\n    Args:\n        image_path: The path to the image on which to answer the question. This should be a local path to downloaded image.\n        question: The question to answer.\n    \"\"\"\n    import mimetypes\n    import os\n\n    import requests\n\n    from .visual_qa import encode_image\n\n    add_note = False\n    if not question:\n        add_note = True\n        question = \"Please write a detailed caption for this image.\"\n    if not isinstance(image_path, str):\n        raise Exception(\"You should provide at least `image_path` string argument to this tool!\")\n\n    mime_type, _ = mimetypes.guess_type(image_path)\n    base64_image = encode_image(image_path)\n\n    payload = {\n        \"model\": \"gpt-4o\",\n        \"messages\": [\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\"type\": \"text\", \"text\": question},\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:{mime_type};base64,{base64_image}\"}},\n                ],\n            }\n        ],\n        \"max_tokens\": 1000,\n    }\n    headers = {\"Content-Type\": \"application/json\", \"Authorization\": f\"Bearer 
{os.getenv('OPENAI_API_KEY')}\"}\n    response = requests.post(\"https://api.openai.com/v1/chat/completions\", headers=headers, json=payload)\n    try:\n        output = response.json()[\"choices\"][0][\"message\"][\"content\"]\n    except Exception:\n        raise Exception(f\"Response format unexpected: {response.json()}\")\n\n    if add_note:\n        output = f\"You did not provide a particular question, so here is a detailed caption for the image: {output}\"\n\n    return output\n"
  },
  {
    "path": "examples/open_deep_research/visual_vs_text_browser.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Compare a text-based vs a vision-based browser\\n\",\n    \"\\n\",\n    \"Warning: this notebook is experimental, it probably won't work out of the box!\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"!pip install \\\"smolagents[litellm,toolkit]\\\" -q\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import datasets\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"eval_ds = datasets.load_dataset(\\\"gaia-benchmark/GAIA\\\", \\\"2023_all\\\")[\\\"validation\\\"]\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"to_keep = [\\n\",\n    \"    \\\"What's the last line of the rhyme under the flavor\\\",\\n\",\n    \"    'Of the authors (First M. Last) that worked on the paper \\\"Pie Menus or Linear Menus',\\n\",\n    \"    \\\"In Series 9, Episode 11 of Doctor Who, the Doctor is trapped inside an ever-shifting maze. What is this location called in the official script for the episode? Give the setting exactly as it appears in the first scene heading.\\\",\\n\",\n    \"    \\\"Which contributor to the version of OpenCV where support was added for the Mask-RCNN model has the same name as a former Chinese head of government when the names are transliterated to the Latin alphabet?\\\",\\n\",\n    \"    \\\"The photograph in the Whitney Museum of American Art's collection with accession number 2022.128 shows a person holding a book. Which military unit did the author of this book join in 1813? Answer without using articles.\\\",\\n\",\n    \"    \\\"I went to Virtue restaurant & bar in Chicago for my birthday on March 22, 2021 and the main course I had was delicious! 
Unfortunately, when I went back about a month later on April 21, it was no longer on the dinner menu.\\\",\\n\",\n    \"    \\\"In Emily Midkiff's June 2014 article in a journal named for the one of Hreidmar's \\\",\\n\",\n    \"    \\\"Under DDC 633 on Bielefeld University Library's BASE, as of 2020\\\",\\n\",\n    \"    \\\"In the 2018 VSCode blog post on replit.com, what was the command they clicked on in the last video to remove extra lines?\\\",\\n\",\n    \"    \\\"The Metropolitan Museum of Art has a portrait in its collection with an accession number of 29.100.5. Of the consecrators and co-consecrators\\\",\\n\",\n    \"    \\\"In Nature journal's Scientific Reports conference proceedings from 2012, in the article that did not mention plasmons or plasmonics, what nano-compound is studied?\\\",\\n\",\n    \"    'In the year 2022, and before December, what does \\\"R\\\" stand for in the three core policies of the type of content',\\n\",\n    \"    \\\"Who nominated the only Featured Article on English Wikipedia about a dinosaur that was promoted in November 2016?\\\",\\n\",\n    \"]\\n\",\n    \"eval_ds = eval_ds.filter(lambda row: any([el in row[\\\"Question\\\"] for el in to_keep]))\\n\",\n    \"eval_ds = eval_ds.rename_columns({\\\"Question\\\": \\\"question\\\", \\\"Final answer\\\": \\\"true_answer\\\", \\\"Level\\\": \\\"task\\\"})\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import os\\n\",\n    \"\\n\",\n    \"from dotenv import load_dotenv\\n\",\n    \"from huggingface_hub import login\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"load_dotenv(override=True)\\n\",\n    \"\\n\",\n    \"login(os.getenv(\\\"HF_TOKEN\\\"))\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Text browser\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n  
 \"source\": [\n    \"from scripts.run_agents import answer_questions\\n\",\n    \"from scripts.text_inspector_tool import TextInspectorTool\\n\",\n    \"from scripts.text_web_browser import (\\n\",\n    \"    ArchiveSearchTool,\\n\",\n    \"    FinderTool,\\n\",\n    \"    FindNextTool,\\n\",\n    \"    NavigationalSearchTool,\\n\",\n    \"    PageDownTool,\\n\",\n    \"    PageUpTool,\\n\",\n    \"    SearchInformationTool,\\n\",\n    \"    VisitTool,\\n\",\n    \")\\n\",\n    \"from scripts.visual_qa import VisualQAGPT4Tool\\n\",\n    \"\\n\",\n    \"from smolagents import CodeAgent, LiteLLMModel\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"proprietary_model = LiteLLMModel(model_id=\\\"gpt-4o\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"### BUILD AGENTS & TOOLS\\n\",\n    \"\\n\",\n    \"WEB_TOOLS = [\\n\",\n    \"    SearchInformationTool(),\\n\",\n    \"    NavigationalSearchTool(),\\n\",\n    \"    VisitTool(),\\n\",\n    \"    PageUpTool(),\\n\",\n    \"    PageDownTool(),\\n\",\n    \"    FinderTool(),\\n\",\n    \"    FindNextTool(),\\n\",\n    \"    ArchiveSearchTool(),\\n\",\n    \"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"surfer_agent = CodeAgent(\\n\",\n    \"    model=proprietary_model,\\n\",\n    \"    tools=WEB_TOOLS,\\n\",\n    \"    max_steps=20,\\n\",\n    \"    verbosity_level=2,\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"results_text = answer_questions(\\n\",\n    \"    eval_ds,\\n\",\n    \"    surfer_agent,\\n\",\n    \"    \\\"code_gpt4o_27-01_text\\\",\\n\",\n    \"    reformulation_model=proprietary_model,\\n\",\n    \"    output_folder=\\\"output_browsers\\\",\\n\",\n    \"    visual_inspection_tool=VisualQAGPT4Tool(),\\n\",\n    \"    text_inspector_tool=TextInspectorTool(proprietary_model, 40000),\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Vision browser\"\n   
]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"!pip install helium -q\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from scripts.visual_qa import VisualQAGPT4Tool\\n\",\n    \"\\n\",\n    \"from smolagents import CodeAgent, LiteLLMModel, WebSearchTool\\n\",\n    \"from smolagents.vision_web_browser import (\\n\",\n    \"    close_popups,\\n\",\n    \"    go_back,\\n\",\n    \"    helium_instructions,\\n\",\n    \"    initialize_agent,\\n\",\n    \"    save_screenshot,\\n\",\n    \"    search_item_ctrl_f,\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"proprietary_model = LiteLLMModel(model_id=\\\"gpt-4o\\\")\\n\",\n    \"vision_browser_agent = initialize_agent(proprietary_model)\\n\",\n    \"### BUILD AGENTS & TOOLS\\n\",\n    \"\\n\",\n    \"CodeAgent(\\n\",\n    \"    tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],\\n\",\n    \"    model=proprietary_model,\\n\",\n    \"    additional_authorized_imports=[\\\"helium\\\"],\\n\",\n    \"    step_callbacks=[save_screenshot],\\n\",\n    \"    max_steps=20,\\n\",\n    \"    verbosity_level=2,\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"results_vision = answer_questions(\\n\",\n    \"    eval_ds,\\n\",\n    \"    vision_browser_agent,\\n\",\n    \"    \\\"code_gpt4o_27-01_vision\\\",\\n\",\n    \"    reformulation_model=proprietary_model,\\n\",\n    \"    output_folder=\\\"output_browsers\\\",\\n\",\n    \"    visual_inspection_tool=VisualQAGPT4Tool(),\\n\",\n    \"    text_inspector_tool=TextInspectorTool(proprietary_model, 40000),\\n\",\n    \"    postprompt=helium_instructions\\n\",\n    \"    + \\\"Any web browser controls won't work on .pdf urls, rather use the tool 'inspect_file_as_text' to read them\\\",\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   
\"source\": [\n    \"### Browser-use browser\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"!pip install browser-use lxml_html_clean -q\\n\",\n    \"!playwright install\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import asyncio\\n\",\n    \"\\n\",\n    \"import nest_asyncio\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"nest_asyncio.apply()\\n\",\n    \"\\n\",\n    \"from browser_use import Agent\\n\",\n    \"from dotenv import load_dotenv\\n\",\n    \"from langchain_openai import ChatOpenAI\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"load_dotenv()\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class BrowserUseAgent:\\n\",\n    \"    logs = []\\n\",\n    \"\\n\",\n    \"    def write_inner_memory_from_logs(self, summary_mode):\\n\",\n    \"        return self.results\\n\",\n    \"\\n\",\n    \"    def run(self, task, **kwargs):\\n\",\n    \"        agent = Agent(\\n\",\n    \"            task=task,\\n\",\n    \"            llm=ChatOpenAI(model=\\\"gpt-4o\\\"),\\n\",\n    \"        )\\n\",\n    \"        self.results = asyncio.get_event_loop().run_until_complete(agent.run())\\n\",\n    \"        return self.results.history[-1].result[0].extracted_content\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"browser_use_agent = BrowserUseAgent()\\n\",\n    \"\\n\",\n    \"results_browseruse = answer_questions(\\n\",\n    \"    eval_ds,\\n\",\n    \"    browser_use_agent,\\n\",\n    \"    \\\"gpt-4o_27-01_browseruse\\\",\\n\",\n    \"    reformulation_model=proprietary_model,\\n\",\n    \"    output_folder=\\\"output_browsers\\\",\\n\",\n    \"    visual_inspection_tool=VisualQAGPT4Tool(),\\n\",\n    \"    text_inspector_tool=TextInspectorTool(proprietary_model, 40000),\\n\",\n    \"    postprompt=\\\"\\\",\\n\",\n    \"    run_simple=True,\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": 
\"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Get results\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import pandas as pd\\n\",\n    \"from scripts.gaia_scorer import question_scorer\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"results_vision, results_text, results_browseruse = (\\n\",\n    \"    pd.DataFrame(results_vision),\\n\",\n    \"    pd.DataFrame(results_text),\\n\",\n    \"    pd.DataFrame(results_browseruse),\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"results_vision[\\\"is_correct\\\"] = results_vision.apply(\\n\",\n    \"    lambda x: question_scorer(x[\\\"prediction\\\"], x[\\\"true_answer\\\"]), axis=1\\n\",\n    \")\\n\",\n    \"results_text[\\\"is_correct\\\"] = results_text.apply(lambda x: question_scorer(x[\\\"prediction\\\"], x[\\\"true_answer\\\"]), axis=1)\\n\",\n    \"results_browseruse[\\\"is_correct\\\"] = results_browseruse.apply(\\n\",\n    \"    lambda x: question_scorer(x[\\\"prediction\\\"], x[\\\"true_answer\\\"]), axis=1\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"results = pd.concat([results_vision, results_text, results_browseruse])\\n\",\n    \"results.groupby(\\\"agent_name\\\")[\\\"is_correct\\\"].mean()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"correct_vision_results = results_vision.loc[results_vision[\\\"is_correct\\\"]]\\n\",\n    \"correct_vision_results\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"false_text_results = results_text.loc[~results_text[\\\"is_correct\\\"]]\\n\",\n    \"false_text_results\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"gaia\",\n  
 \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.12.0\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 2\n}\n"
  },
  {
    "path": "examples/plan_customization/README.md",
    "content": "# Human-in-the-Loop: Customize Agent Plan Interactively\n\nThis example demonstrates advanced usage of the smolagents library, specifically showing how to implement Human-in-the-Loop strategies to:\n\n1. **Interrupt agent execution after plan creation** using step callbacks\n2. **Allow user interaction** to review and modify plans (Human-in-the-Loop)\n3. **Resume execution** while preserving agent memory\n4. **Modify plans in real-time** based on user feedback, keeping the human in control\n\n## Human-in-the-Loop Key Features\n\n### Interactive Plan Review\n- The agent creates a plan and pauses execution\n- Users can view the complete plan before execution begins\n- Options to approve, modify, or cancel the plan\n\n### Plan Modification\n- Users can edit the agent's plan in real-time\n- Modified plans are applied to the agent's memory\n- Execution continues with the updated plan\n\n### Memory Preservation\n- Using `reset=False` preserves the agent's memory between runs\n- Demonstrates how to build on previous interactions\n- Shows memory state management across multiple executions\n- Maintains transparency and control\n\n## Usage\n\n### Basic Usage\n```python\npython plan_customization.py\n```\n\n### Key Components\n\n#### Step Callback Function\n```python\ndef interrupt_after_plan(memory_step, agent):\n    if isinstance(memory_step, PlanningStep):\n        # Display plan and get user input\n        # Modify plan if requested\n        # Continue or interrupt based on user choice\n```\n\n#### Agent Configuration\n```python\nagent = CodeAgent(\n    model=InferenceClientModel(),\n    tools=[DuckDuckGoSearchTool()],\n    planning_interval=5,  # Plan every 5 steps\n    step_callbacks={PlanningStep: interrupt_after_plan},  # Register callback for PlanningStep\n    max_steps=10,\n    verbosity_level=1\n)\n```\n\n#### Resuming Execution\n```python\n# First run - may be interrupted\nagent.run(task, reset=True)\n\n# Resume with preserved 
memory\nagent.run(task, reset=False)  # Keeps all previous steps\n```\n\n## Example Human-in-the-Loop Workflow\n\n1. **Agent starts** with a complex task\n2. **Planning step** is created automatically\n3. **Execution pauses** for human review - step callback triggers\n4. **Human-in-the-Loop**:\n   1. **User reviews the plan** in a formatted display\n   2. **User decides** to approve, modify, or cancel the plan\n   3. **User modifies the plan** (if requested) - user can edit the plan\n5. **Execution resumes** with approved/modified plan\n6. **Memory preservation** - all steps are maintained for future runs, maintaining transparency and control\n\n## Interactive Elements\n\n### Plan Display\n```\n============================================================\n🤖 AGENT PLAN CREATED\n============================================================\n1. Search for recent AI developments\n2. Analyze the top results\n3. Summarize the 3 most significant breakthroughs\n4. Include sources for each breakthrough\n============================================================\n```\n\n### User Choices\n```\nChoose an option:\n1. Approve plan\n2. Modify plan\n3. Cancel\nYour choice (1-3):\n```\n\n### Plan Modification Interface\n```\n----------------------------------------\nMODIFY PLAN\n----------------------------------------\nCurrent plan: [displays current plan]\n----------------------------------------\nEnter your modified plan (press Enter twice to finish):\n```\n\n## Advanced Features\n\n### Memory State Inspection\nThe example shows how to inspect the agent's memory:\n```python\nprint(f\"Current memory contains {len(agent.memory.steps)} steps:\")\nfor i, step in enumerate(agent.memory.steps):\n    step_type = type(step).__name__\n    print(f\"  {i+1}. 
{step_type}\")\n```\n\n### Error Handling\nProper error handling for:\n- User cancellation\n- Plan modification errors\n- Resume execution failures\n\n## Requirements\n\n- smolagents library\n- DuckDuckGoSearchTool (included with smolagents)\n- Access to InferenceClientModel (requires a Hugging Face API token)\n\n## Educational Value\n\nThis example teaches:\n- **Step callback implementation** for custom agent behavior\n- **Memory management** in multi-step agents\n- **User interaction patterns** in agentic systems\n- **Plan modification techniques** for dynamic agent control\n- **Error handling** in interactive agent systems\n\nIt is a good starting point for understanding how to build interactive, user-controlled AI agents that adapt their behavior based on human feedback.\n"
  },
  {
    "path": "examples/plan_customization/plan_customization.py",
    "content": "\"\"\"\nPlan Customization Example\n\nThis example demonstrates how to use step callbacks to interrupt the agent after\nplan creation, allow user interaction to approve or modify the plan, and then\nresume execution while preserving agent memory.\n\nKey concepts demonstrated:\n1. Step callbacks to interrupt after PlanningStep\n2. Extracting and modifying the current plan\n3. Resuming execution with reset=False to preserve memory\n4. User interaction for plan approval/modification\n\"\"\"\n\nfrom smolagents import CodeAgent, DuckDuckGoSearchTool, InferenceClientModel, PlanningStep\n\n\ndef display_plan(plan_content):\n    \"\"\"Display the plan in a formatted way\"\"\"\n    print(\"\\n\" + \"=\" * 60)\n    print(\"🤖 AGENT PLAN CREATED\")\n    print(\"=\" * 60)\n    print(plan_content)\n    print(\"=\" * 60)\n\n\ndef get_user_choice():\n    \"\"\"Get user's choice for plan approval\"\"\"\n    while True:\n        choice = input(\"\\nChoose an option:\\n1. Approve plan\\n2. Modify plan\\n3. Cancel\\nYour choice (1-3): \").strip()\n        if choice in [\"1\", \"2\", \"3\"]:\n            return int(choice)\n        print(\"Invalid choice. 
Please enter 1, 2, or 3.\")\n\n\ndef get_modified_plan(original_plan):\n    \"\"\"Allow user to modify the plan\"\"\"\n    print(\"\\n\" + \"-\" * 40)\n    print(\"MODIFY PLAN\")\n    print(\"-\" * 40)\n    print(\"Current plan:\")\n    print(original_plan)\n    print(\"-\" * 40)\n    print(\"Enter your modified plan (press Enter twice to finish):\")\n\n    lines = []\n    empty_line_count = 0\n\n    while empty_line_count < 2:\n        line = input()\n        if line.strip() == \"\":\n            empty_line_count += 1\n        else:\n            empty_line_count = 0\n        lines.append(line)\n\n    # Remove the last two empty lines\n    modified_plan = \"\\n\".join(lines[:-2])\n    return modified_plan if modified_plan.strip() else original_plan\n\n\ndef interrupt_after_plan(memory_step, agent):\n    \"\"\"\n    Step callback that interrupts the agent after a planning step is created.\n    This allows for user interaction to review and potentially modify the plan.\n    \"\"\"\n    if isinstance(memory_step, PlanningStep):\n        print(\"\\n🛑 Agent interrupted after plan creation...\")\n\n        # Display the created plan\n        display_plan(memory_step.plan)\n\n        # Get user choice\n        choice = get_user_choice()\n\n        if choice == 1:  # Approve plan\n            print(\"✅ Plan approved! 
Continuing execution...\")\n            # Don't interrupt - let the agent continue\n            return\n\n        elif choice == 2:  # Modify plan\n            # Get modified plan from user\n            modified_plan = get_modified_plan(memory_step.plan)\n\n            # Update the plan in the memory step\n            memory_step.plan = modified_plan\n\n            print(\"\\nPlan updated!\")\n            display_plan(modified_plan)\n            print(\"✅ Continuing with modified plan...\")\n            # Don't interrupt - let the agent continue with modified plan\n            return\n\n        elif choice == 3:  # Cancel\n            print(\"❌ Execution cancelled by user.\")\n            agent.interrupt()\n            return\n\n\ndef main():\n    \"\"\"Run the complete plan customization example\"\"\"\n    print(\"🚀 Starting Plan Customization Example\")\n    print(\"=\" * 60)\n\n    # Create agent with planning enabled and step callback\n    agent = CodeAgent(\n        model=InferenceClientModel(),\n        tools=[DuckDuckGoSearchTool()],  # Add a search tool for more interesting plans\n        planning_interval=5,  # Plan every 5 steps for demonstration\n        step_callbacks={PlanningStep: interrupt_after_plan},\n        max_steps=10,\n        verbosity_level=1,  # Show agent thoughts\n    )\n\n    # Define a task that will benefit from planning\n    task = \"\"\"Search for recent developments in artificial intelligence and provide a summary\n    of the top 3 most significant breakthroughs in 2024. 
Include the source of each breakthrough.\"\"\"\n\n    try:\n        print(f\"\\n📋 Task: {task}\")\n        print(\"\\n🤖 Agent starting execution...\")\n\n        # First run - will create plan and potentially get interrupted\n        result = agent.run(task)\n\n        # If we get here, the plan was approved or execution completed\n        print(\"\\n✅ Task completed successfully!\")\n        print(\"\\n📄 Final Result:\")\n        print(\"-\" * 40)\n        print(result)\n\n    except Exception as e:\n        if \"interrupted\" in str(e).lower():\n            print(\"\\n🛑 Agent execution was cancelled by user.\")\n            print(\"\\nTo resume execution later, you could call:\")\n            print(\"agent.run(task, reset=False)  # This preserves the agent's memory\")\n\n            # Demonstrate resuming with reset=False\n            print(\"\\n\" + \"=\" * 60)\n            print(\"DEMONSTRATION: Resuming with reset=False\")\n            print(\"=\" * 60)\n\n            # Show current memory state\n            print(f\"\\n📚 Current memory contains {len(agent.memory.steps)} steps:\")\n            for i, step in enumerate(agent.memory.steps):\n                step_type = type(step).__name__\n                print(f\"  {i + 1}. {step_type}\")\n\n            # Ask if user wants to see resume demonstration\n            resume_choice = input(\"\\nWould you like to see resume demonstration? 
(y/n): \").strip().lower()\n            if resume_choice == \"y\":\n                print(\"\\n🔄 Resuming execution...\")\n                try:\n                    # Resume without resetting - preserves memory\n                    result = agent.run(task, reset=False)\n                    print(\"\\n✅ Task completed after resume!\")\n                    print(\"\\n📄 Final Result:\")\n                    print(\"-\" * 40)\n                    print(result)\n                except Exception as resume_error:\n                    print(f\"\\n❌ Error during resume: {resume_error}\")\n        else:\n            # Any exception that is not a user interruption\n            print(f\"\\n❌ An error occurred: {e}\")\n\n\nif __name__ == \"__main__\":\n    # Run the main example\n    main()\n"
  },
  {
    "path": "examples/rag.py",
    "content": "# from huggingface_hub import login\n\n# login()\nimport datasets\nfrom langchain.docstore.document import Document\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain_community.retrievers import BM25Retriever\n\n\nknowledge_base = datasets.load_dataset(\"m-ric/huggingface_doc\", split=\"train\")\nknowledge_base = knowledge_base.filter(lambda row: row[\"source\"].startswith(\"huggingface/transformers\"))\n\nsource_docs = [\n    Document(page_content=doc[\"text\"], metadata={\"source\": doc[\"source\"].split(\"/\")[1]}) for doc in knowledge_base\n]\n\ntext_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=500,\n    chunk_overlap=50,\n    add_start_index=True,\n    strip_whitespace=True,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],\n)\ndocs_processed = text_splitter.split_documents(source_docs)\n\nfrom smolagents import Tool\n\n\nclass RetrieverTool(Tool):\n    name = \"retriever\"\n    description = \"Uses lexical search to retrieve the parts of transformers documentation that could be most relevant to answer your query.\"\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"The query to perform. This should be lexically close to your target documents. 
Use the affirmative form rather than a question.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, docs, **kwargs):\n        super().__init__(**kwargs)\n        self.retriever = BM25Retriever.from_documents(docs, k=10)\n\n    def forward(self, query: str) -> str:\n        assert isinstance(query, str), \"Your search query must be a string\"\n\n        docs = self.retriever.invoke(\n            query,\n        )\n        return \"\\nRetrieved documents:\\n\" + \"\".join(\n            [f\"\\n\\n===== Document {str(i)} =====\\n\" + doc.page_content for i, doc in enumerate(docs)]\n        )\n\n\nfrom smolagents import CodeAgent, InferenceClientModel\n\n\nretriever_tool = RetrieverTool(docs_processed)\nagent = CodeAgent(\n    tools=[retriever_tool],\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n    max_steps=4,\n    verbosity_level=2,\n    stream_outputs=True,\n)\n\nagent_output = agent.run(\"For a transformers model training, which is slower, the forward or the backward pass?\")\n\nprint(\"Final output:\")\nprint(agent_output)\n"
  },
  {
    "path": "examples/rag_using_chromadb.py",
    "content": "import os\n\nimport datasets\nfrom langchain.docstore.document import Document\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain_chroma import Chroma\n\n# from langchain_community.document_loaders import PyPDFLoader\nfrom langchain_huggingface import HuggingFaceEmbeddings\nfrom tqdm import tqdm\nfrom transformers import AutoTokenizer\n\n# from langchain_openai import OpenAIEmbeddings\nfrom smolagents import LiteLLMModel, Tool\nfrom smolagents.agents import CodeAgent\n\n\n# from smolagents.agents import ToolCallingAgent\n\n\nknowledge_base = datasets.load_dataset(\"m-ric/huggingface_doc\", split=\"train\")\n\nsource_docs = [\n    Document(page_content=doc[\"text\"], metadata={\"source\": doc[\"source\"].split(\"/\")[1]}) for doc in knowledge_base\n]\n\n## For your own PDFs, you can use the following code to load them into source_docs\n# pdf_directory = \"pdfs\"\n# pdf_files = [\n#     os.path.join(pdf_directory, f)\n#     for f in os.listdir(pdf_directory)\n#     if f.endswith(\".pdf\")\n# ]\n# source_docs = []\n\n# for file_path in pdf_files:\n#     loader = PyPDFLoader(file_path)\n#     docs.extend(loader.load())\n\ntext_splitter = RecursiveCharacterTextSplitter.from_huggingface_tokenizer(\n    AutoTokenizer.from_pretrained(\"thenlper/gte-small\"),\n    chunk_size=200,\n    chunk_overlap=20,\n    add_start_index=True,\n    strip_whitespace=True,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],\n)\n\n# Split docs and keep only unique ones\nprint(\"Splitting documents...\")\ndocs_processed = []\nunique_texts = {}\nfor doc in tqdm(source_docs):\n    new_docs = text_splitter.split_documents([doc])\n    for new_doc in new_docs:\n        if new_doc.page_content not in unique_texts:\n            unique_texts[new_doc.page_content] = True\n            docs_processed.append(new_doc)\n\n\nprint(\"Embedding documents... 
This should take a few minutes (5 minutes on MacBook with M1 Pro)\")\n# Initialize embeddings and ChromaDB vector store\nembeddings = HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-MiniLM-L6-v2\")\n\n\n# embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n\nvector_store = Chroma.from_documents(docs_processed, embeddings, persist_directory=\"./chroma_db\")\n\n\nclass RetrieverTool(Tool):\n    name = \"retriever\"\n    description = (\n        \"Uses semantic search to retrieve the parts of documentation that could be most relevant to answer your query.\"\n    )\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, vector_store, **kwargs):\n        super().__init__(**kwargs)\n        self.vector_store = vector_store\n\n    def forward(self, query: str) -> str:\n        assert isinstance(query, str), \"Your search query must be a string\"\n        docs = self.vector_store.similarity_search(query, k=3)\n        return \"\\nRetrieved documents:\\n\" + \"\".join(\n            [f\"\\n\\n===== Document {str(i)} =====\\n\" + doc.page_content for i, doc in enumerate(docs)]\n        )\n\n\nretriever_tool = RetrieverTool(vector_store)\n\n# Choose which LLM engine to use!\n\n# from smolagents import InferenceClientModel\n# model = InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\")\n\n# from smolagents import TransformersModel\n# model = TransformersModel(model_id=\"Qwen/Qwen3-4B-Instruct-2507\")\n\n# For anthropic: change model_id below to 'anthropic/claude-4-sonnet-latest' and also change 'os.environ.get(\"ANTHROPIC_API_KEY\")'\nmodel = LiteLLMModel(\n    model_id=\"groq/openai/gpt-oss-120b\",\n    api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\n# # You can also use the 
ToolCallingAgent class\n# agent = ToolCallingAgent(\n#     tools=[retriever_tool],\n#     model=model,\n#     verbosity_level=2,\n# )\n\nagent = CodeAgent(\n    tools=[retriever_tool],\n    model=model,\n    max_steps=4,\n    verbosity_level=2,\n    stream_outputs=True,\n)\n\nagent_output = agent.run(\"How can I push a model to the Hub?\")\n\n\nprint(\"Final output:\")\nprint(agent_output)\n"
  },
  {
    "path": "examples/sandboxed_execution.py",
    "content": "from smolagents import CodeAgent, InferenceClientModel, WebSearchTool\n\n\nmodel = InferenceClientModel()\n\n# Blaxel executor example\nwith CodeAgent(tools=[WebSearchTool()], model=model, executor_type=\"blaxel\") as agent:\n    output = agent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\nprint(\"Blaxel executor result:\", output)\n\n# Docker executor example\nwith CodeAgent(tools=[WebSearchTool()], model=model, executor_type=\"docker\") as agent:\n    output = agent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\nprint(\"Docker executor result:\", output)\n\n# E2B executor example\nwith CodeAgent(tools=[WebSearchTool()], model=model, executor_type=\"e2b\") as agent:\n    output = agent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\nprint(\"E2B executor result:\", output)\n\n# Modal executor example\nwith CodeAgent(tools=[WebSearchTool()], model=model, executor_type=\"modal\") as agent:\n    output = agent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\nprint(\"Modal executor result:\", output)\n\n# WebAssembly executor example\nwith CodeAgent(tools=[], model=model, executor_type=\"wasm\") as agent:\n    output = agent.run(\"Calculate the square root of 125.\")\nprint(\"Wasm executor result:\", output)\n# TODO: Support tools\n# with CodeAgent(tools=[VisitWebpageTool()], model=model, executor_type=\"wasm\") as agent:\n#     output = agent.run(\"What is the content of the Wikipedia page at https://en.wikipedia.org/wiki/Intelligent_agent?\")\n"
  },
  {
    "path": "examples/server/README.md",
    "content": "# Smolagents Chat Server Demo\n\nThis is a simple web server that provides a chat interface for interacting with an AI code agent powered by `smolagents` and the Qwen3-Next-80B-A3B-Thinking model, enhanced with MCP (Model Context Protocol) tools.\n\n## Features\n\n- Web-based chat interface\n- AI code agent powered by Qwen3-Next-80B-A3B-Thinking\n- Integration with MCP tools through MCPClient\n- Asynchronous request handling\n- Clean, responsive UI\n- Graceful shutdown handling\n\n## Requirements\n\n- Python 3.10+\n- Starlette\n- AnyIO\n- Uvicorn\n- Smolagents with MCP support\n\n## Installation\n\n1. Install the required packages:\n\n```bash\npip install starlette anyio 'smolagents[mcp]' uvicorn\n```\n\n2. Optional: If you want to use a specific model, you may need additional dependencies.\n\n## Usage\n\n1. Run the server:\n\n```bash\nuvicorn examples.server.main:app --reload\n```\n\n2. Open your browser and navigate to `http://localhost:8000`\n\n3. Interact with the AI code agent through the chat interface\n\n## How It Works\n\nThe server consists of two main routes:\n- `/` - Serves the HTML page with the chat interface\n- `/chat` - API endpoint that processes messages and returns responses\n\nThe server integrates with MCP tools through the following components:\n\n1. MCPClient Configuration:\n```python\nmcp_server_parameters = {\n    \"url\": \"https://evalstate-hf-mcp-server.hf.space/mcp\",\n    \"transport\": \"streamable-http\",\n}\nmcp_client = MCPClient(server_parameters=mcp_server_parameters)\n```\n\n2. CodeAgent with MCP Tools:\n```python\nagent = CodeAgent(\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n    tools=mcp_client.get_tools(),\n)\n```\n\nWhen a user sends a message:\n1. The message is sent to the `/chat` endpoint\n2. The server runs the AI code agent in a separate thread\n3. The agent processes the message using MCP tools\n4. 
The agent's response is returned to the client and displayed in the chat\n\nThe server also includes a shutdown handler that properly disconnects the MCP client when the server stops:\n```python\nasync def shutdown():\n    mcp_client.disconnect()\n```\n\n## Customization\n\nYou can modify the `CodeAgent` configuration by changing the model or MCP server parameters. For example:\n\n```python\n# Custom MCP server\nmcp_server_parameters = {\n    \"url\": \"your-mcp-server-url\",\n    \"transport\": \"your-transport-method\",\n}\n\n# Custom agent configuration\nagent = CodeAgent(\n    model=InferenceClientModel(model_id=\"your-preferred-model\"),\n    tools=mcp_client.get_tools(),\n)\n```\n"
  },
  {
    "path": "examples/server/main.py",
    "content": "from anyio import to_thread\nfrom starlette.applications import Starlette\nfrom starlette.responses import HTMLResponse, JSONResponse\nfrom starlette.routing import Route\n\nfrom smolagents import CodeAgent, InferenceClientModel, MCPClient\n\n\n# Create an MCP client to connect to the MCP server\nmcp_server_parameters = {\n    \"url\": \"https://evalstate-hf-mcp-server.hf.space/mcp\",\n    \"transport\": \"streamable-http\",\n}\nmcp_client = MCPClient(server_parameters=mcp_server_parameters)\n\n# Create a CodeAgent with a specific model and the tools from the MCP client\nagent = CodeAgent(\n    model=InferenceClientModel(model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\"),\n    tools=mcp_client.get_tools(),\n)\n\n\n# Define the shutdown handler to disconnect the MCP client\nasync def shutdown():\n    mcp_client.disconnect()\n\n\nasync def homepage(request):\n    return HTMLResponse(\n        r\"\"\"\n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n    <meta charset=\"UTF-8\">\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n    <title>Smolagents Demo</title>\n    <style>\n        body {\n            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;\n            max-width: 800px;\n            margin: 0 auto;\n            padding: 20px;\n            background-color: #f5f5f5;\n        }\n        .container {\n            background: white;\n            border-radius: 12px;\n            padding: 30px;\n            box-shadow: 0 2px 10px rgba(0,0,0,0.1);\n        }\n        h1 {\n            color: #333;\n            text-align: center;\n            margin-bottom: 30px;\n        }\n        .chat-container {\n            border: 1px solid #ddd;\n            border-radius: 8px;\n            height: 400px;\n            overflow-y: auto;\n            padding: 15px;\n            margin-bottom: 20px;\n            background-color: #fafafa;\n        }\n        .message {\n            margin-bottom: 15px;\n  
          padding: 10px;\n            border-radius: 6px;\n        }\n        .user-message {\n            background-color: #007bff;\n            color: white;\n            margin-left: 50px;\n        }\n        .agent-message {\n            background-color: #e9ecef;\n            color: #333;\n            margin-right: 50px;\n        }\n        .input-container {\n            display: flex;\n            gap: 10px;\n        }\n        input[type=\"text\"] {\n            flex: 1;\n            padding: 12px;\n            border: 1px solid #ddd;\n            border-radius: 6px;\n            font-size: 16px;\n        }\n        button {\n            padding: 12px 24px;\n            background-color: #007bff;\n            color: white;\n            border: none;\n            border-radius: 6px;\n            cursor: pointer;\n            font-size: 16px;\n        }\n        button:hover {\n            background-color: #0056b3;\n        }\n        button:disabled {\n            background-color: #ccc;\n            cursor: not-allowed;\n        }\n        .loading {\n            color: #666;\n            font-style: italic;\n        }\n    </style>\n</head>\n<body>\n    <div class=\"container\">\n        <h1>🤖 Smolagents Demo</h1>\n        <div class=\"chat-container\" id=\"chat-container\">\n            <div class=\"message agent-message\">\n                Hello! I'm a code agent with access to MCP tools. 
Ask me anything!\n            </div>\n        </div>\n        <div class=\"input-container\">\n            <input type=\"text\" id=\"message-input\" placeholder=\"Ask me anything...\" autofocus>\n            <button onclick=\"sendMessage()\" id=\"send-button\">Send</button>\n        </div>\n    </div>\n\n    <script>\n        const chatContainer = document.getElementById('chat-container');\n        const messageInput = document.getElementById('message-input');\n        const sendButton = document.getElementById('send-button');\n\n        function addMessage(content, isUser = false) {\n            const messageDiv = document.createElement('div');\n            messageDiv.className = `message ${isUser ? 'user-message' : 'agent-message'}`;\n            messageDiv.textContent = content;\n            chatContainer.appendChild(messageDiv);\n            chatContainer.scrollTop = chatContainer.scrollHeight;\n        }\n\n        async function sendMessage() {\n            const message = messageInput.value.trim();\n            if (!message) return;\n\n            // Add user message\n            addMessage(message, true);\n            messageInput.value = '';\n            sendButton.disabled = true;\n            sendButton.textContent = 'Sending...';\n\n            // Add loading indicator\n            const loadingDiv = document.createElement('div');\n            loadingDiv.className = 'message agent-message loading';\n            loadingDiv.textContent = 'Thinking...';\n            chatContainer.appendChild(loadingDiv);\n            chatContainer.scrollTop = chatContainer.scrollHeight;\n\n            try {\n                const response = await fetch('/chat', {\n                    method: 'POST',\n                    headers: {\n                        'Content-Type': 'application/json',\n                    },\n                    body: JSON.stringify({ message }),\n                });\n\n                const data = await response.json();\n\n                // Remove 
loading indicator\n                chatContainer.removeChild(loadingDiv);\n\n                // Add agent response\n                addMessage(data.reply);\n            } catch (error) {\n                // Remove loading indicator\n                chatContainer.removeChild(loadingDiv);\n                addMessage(`Error: ${error.message}`);\n            } finally {\n                sendButton.disabled = false;\n                sendButton.textContent = 'Send';\n                messageInput.focus();\n            }\n        }\n\n        // Send message on Enter key\n        messageInput.addEventListener('keypress', function(e) {\n            if (e.key === 'Enter') {\n                sendMessage();\n            }\n        });\n    </script>\n</body>\n</html>\n\"\"\"\n    )\n\n\nasync def chat(request):\n    data = await request.json()\n    message = data.get(\"message\", \"\").strip()\n    # Run in a thread to avoid blocking the event loop\n    result = await to_thread.run_sync(agent.run, message)\n    # Format the result if it's a complex data structure\n    reply = str(result)\n    return JSONResponse({\"reply\": reply})\n\n\napp = Starlette(\n    debug=True,\n    routes=[\n        Route(\"/\", homepage),\n        Route(\"/chat\", chat, methods=[\"POST\"]),\n    ],\n    on_shutdown=[shutdown],  # Register the shutdown handler: disconnect the MCP client\n)\n"
  },
  {
    "path": "examples/smolagents_benchmark/run.py",
    "content": "import argparse\nimport datetime\nimport json\nimport os\nimport threading\nimport time\nfrom concurrent.futures import ThreadPoolExecutor, as_completed\nfrom pathlib import Path\n\nimport datasets\nimport pandas as pd\nfrom dotenv import load_dotenv\nfrom tqdm import tqdm\n\nfrom smolagents import (\n    AgentError,\n    CodeAgent,\n    GoogleSearchTool,\n    InferenceClientModel,\n    LiteLLMModel,\n    PythonInterpreterTool,\n    ToolCallingAgent,\n    VisitWebpageTool,\n)\n\n\nload_dotenv()\nos.makedirs(\"output\", exist_ok=True)\n\nAPPEND_ANSWER_LOCK = threading.Lock()\n\n\ndef parse_arguments():\n    parser = argparse.ArgumentParser(description=\"Runs an agent powered by the given model on smolagent benchmark.\")\n    parser.add_argument(\n        \"--date\",\n        type=str,\n        default=None,\n        help=\"The date for the evaluation.\",\n    )\n    parser.add_argument(\n        \"--eval-dataset\",\n        type=str,\n        default=\"smolagents/benchmark-v1\",\n    )\n    # The eval dataset is gated, so you must first visit its page to request access: https://huggingface.co/datasets/smolagents-benchmark/benchmark-v1\n    parser.add_argument(\n        \"--model-type\",\n        type=str,\n        default=\"InferenceClientModel\",\n        choices=[\"LiteLLMModel\", \"InferenceClientModel\"],\n        help=\"The model type to use (LiteLLMModel or InferenceClientModel)\",\n    )\n    parser.add_argument(\n        \"--model-id\",\n        type=str,\n        required=True,\n        help=\"The model ID to use for the specified model type\",\n    )\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        help=\"The provider for InferenceClientModel - will not be used for LiteLLMModel\",\n    )\n    parser.add_argument(\n        \"--agent-action-type\",\n        type=str,\n        default=\"code\",\n        choices=[\"code\", \"tool-calling\", \"vanilla\"],\n        help=\"The agent action type: 'code', 
'tool-calling', or 'vanilla' to use the vanilla llm\",\n    )\n    parser.add_argument(\n        \"--parallel-workers\",\n        type=int,\n        default=8,\n        help=\"The number of processes to run in parallel\",\n    )\n    parser.add_argument(\n        \"--push-answers-to-hub\",\n        action=\"store_true\",\n        help=\"Push the answers to the hub\",\n    )\n    parser.add_argument(\n        \"--answers-dataset\",\n        type=str,\n        default=\"smolagents/answers\",\n    )\n    return parser.parse_args()\n\n\ndef load_eval_dataset(eval_dataset):\n    # Choose the tasks to evaluate on:\n    # tasks = [\"gaia\"]\n    # or evaluate on all tasks: [\"gaia\", \"math\", \"simpleqa\"]\n    tasks = datasets.get_dataset_config_names(eval_dataset)\n    print(tasks)\n\n    eval_ds = {task: datasets.load_dataset(eval_dataset, task, split=\"test\") for task in tasks}\n    print(pd.DataFrame(eval_ds[\"simpleqa\"]).head())\n    return eval_ds\n\n\ndef serialize_agent_error(obj):\n    if isinstance(obj, AgentError):\n        return {\"error_type\": obj.__class__.__name__, \"message\": obj.message}\n    else:\n        return str(obj)\n\n\ndef append_answer(entry: dict, jsonl_file: str) -> None:\n    jsonl_file = Path(jsonl_file)\n    jsonl_file.parent.mkdir(parents=True, exist_ok=True)\n\n    def convert_to_serializable(obj):\n        if hasattr(obj, \"dict\"):\n            return obj.dict()\n        else:\n            raise TypeError(f\"Object of type {type(obj)} is not JSON serializable\")\n\n    with APPEND_ANSWER_LOCK, open(jsonl_file, \"a\", encoding=\"utf-8\") as fp:\n        fp.write(json.dumps(entry, default=convert_to_serializable) + \"\\n\")\n    assert os.path.exists(jsonl_file), \"File not found!\"\n\n\ndef answer_single_question(example, model, answers_file, action_type):\n    if action_type == \"vanilla\":\n        agent = model\n    elif action_type == \"code\":\n        agent = CodeAgent(\n            
tools=[GoogleSearchTool(provider=\"serper\"), VisitWebpageTool()],\n            model=model,\n            additional_authorized_imports=[\"numpy\", \"sympy\"],\n            max_steps=10,\n        )\n    elif action_type == \"tool-calling\":\n        agent = ToolCallingAgent(\n            tools=[\n                GoogleSearchTool(provider=\"serper\"),\n                VisitWebpageTool(),\n                PythonInterpreterTool(authorized_imports=[\"numpy\", \"sympy\"]),\n            ],\n            model=model,\n            max_steps=10,\n        )\n\n    augmented_question = example[\"question\"]\n    if example[\"source\"] == \"SimpleQA\":\n        augmented_question += \" Answer with only the final number.\"\n    if example[\"source\"] == \"MATH\":\n        augmented_question += \" Write code, not latex.\"\n\n    start_time = time.time()\n\n    try:\n        if action_type == \"vanilla\":\n            answer = agent([{\"role\": \"user\", \"content\": augmented_question}]).content\n            token_counts = agent.monitor.get_total_token_counts()\n            intermediate_steps = answer\n        else:\n            # Run agent 🚀\n            answer = str(agent.run(augmented_question))\n            token_counts = agent.monitor.get_total_token_counts()\n            intermediate_steps = [message.dict() for message in agent.write_memory_to_messages()]\n\n        end_time = time.time()\n    except Exception as e:\n        print(\"Error on \", augmented_question, e)\n        intermediate_steps = []\n        token_counts = {\"input\": 0, \"output\": 0}\n        answer = str(e)\n    end_time = datetime.datetime.now().strftime(\"%Y-%m-%d %H:%M:%S\")\n    annotated_example = {\n        \"model_id\": model.model_id,\n        \"agent_action_type\": action_type,\n        \"question\": augmented_question,\n        \"original_question\": example[\"question\"],\n        \"answer\": answer,\n        \"true_answer\": example[\"true_answer\"],\n        \"source\": 
example[\"source\"],\n        \"intermediate_steps\": intermediate_steps,\n        \"start_time\": start_time,\n        \"end_time\": end_time,\n        \"token_counts\": token_counts,\n    }\n    append_answer(annotated_example, answers_file)\n\n\ndef answer_questions(\n    eval_ds,\n    model,\n    date,\n    action_type: str = \"code\",\n    output_dir: str = \"output\",\n    answers_dataset: str = None,\n    push_answers_to_hub: bool = False,\n    parallel_workers: int = 32,\n):\n    date = date or datetime.date.today().isoformat()\n    model_id = model.model_id\n\n    for task in eval_ds:\n        file_name = f\"{output_dir}/{model_id.replace('/', '__')}__{action_type}__{task}__{date}.jsonl\"\n        print(f\"Starting processing and writing output to '{file_name}'\")\n        answered_questions = []\n        if os.path.exists(file_name):\n            with open(file_name, \"r\") as f:\n                for line in f:\n                    answered_questions.append(json.loads(line)[\"original_question\"])\n\n        examples_todo = [example for example in eval_ds[task] if example[\"question\"] not in answered_questions]\n        print(f\"Launching {parallel_workers} parallel workers.\")\n\n        with ThreadPoolExecutor(max_workers=parallel_workers) as exe:\n            futures = [\n                exe.submit(answer_single_question, example, model, file_name, action_type) for example in examples_todo\n            ]\n            for f in tqdm(as_completed(futures), total=len(examples_todo), desc=\"Processing tasks\"):\n                f.result()\n\n        print(\"All tasks processed.\")\n\n        if push_answers_to_hub and answers_dataset:\n            print(\"Pushing answers to hub...\")\n            ds = datasets.Dataset.from_pandas(pd.read_json(file_name, lines=True), split=\"test\", preserve_index=False)\n            config = f\"{model_id.replace('/', '__')}__{action_type}__{task}\"\n            data_dir = f\"{model_id}/{action_type}/{task}/{date}\"\n       
     ds.push_to_hub(\n                answers_dataset,\n                config_name=config,\n                data_dir=data_dir,\n                split=\"test\",\n                commit_message=f\"Upload {config}\",\n            )\n\n\nif __name__ == \"__main__\":\n    args = parse_arguments()\n\n    eval_ds = load_eval_dataset(args.eval_dataset)\n\n    if args.model_type == \"LiteLLMModel\":\n        model = LiteLLMModel(\n            model_id=args.model_id,\n            max_completion_tokens=8192,\n        )\n    else:\n        model = InferenceClientModel(model_id=args.model_id, provider=args.provider, max_tokens=8192)\n\n    answer_questions(\n        eval_ds,\n        model,\n        args.date,\n        action_type=args.agent_action_type,\n        answers_dataset=args.answers_dataset,\n        push_answers_to_hub=args.push_answers_to_hub,\n        parallel_workers=args.parallel_workers,\n    )\n"
  },
  {
    "path": "examples/smolagents_benchmark/score.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"!pip install -e .. datasets sympy numpy matplotlib seaborn -q  # Install dev version of smolagents + some packages\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 17,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# Benchmark date\\n\",\n    \"# - set a concrete date:\\n\",\n    \"DATE = \\\"2024-12-26\\\"\\n\",\n    \"# - or use default: today\\n\",\n    \"# DATE = None\\n\",\n    \"\\n\",\n    \"# Evaluation dataset\\n\",\n    \"# - the dataset is gated, so you must first visit its page to request access: https://huggingface.co/datasets/smolagents-benchmark/benchmark-v1\\n\",\n    \"EVAL_DATASET = \\\"smolagents/benchmark-v1\\\"\\n\",\n    \"\\n\",\n    \"# Answers dataset: it must be a gated dataset; required to score the answers\\n\",\n    \"ANSWERS_DATASET = \\\"smolagents/answers\\\"\\n\",\n    \"# Whether to push the answers dataset to the Hub\\n\",\n    \"PUSH_ANSWERS_DATASET_TO_HUB = True\\n\",\n    \"\\n\",\n    \"# Results dataset\\n\",\n    \"RESULTS_DATASET = \\\"smolagents/results\\\"\\n\",\n    \"# Whether to push the results dataset to the Hub\\n\",\n    \"PUSH_RESULTS_DATASET_TO_HUB = True\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Constants and utilities/tools\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import datetime\\n\",\n    \"import re\\n\",\n    \"import string\\n\",\n    \"import warnings\\n\",\n    \"from concurrent.futures import ThreadPoolExecutor, as_completed\\n\",\n    \"\\n\",\n    \"import numpy as np\\n\",\n    \"from tqdm import tqdm\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def normalize_number_str(number_str: str) -> float:\\n\",\n    \"    # we replace these 
common units and commas to allow\\n\",\n    \"    # conversion to float\\n\",\n    \"    for char in [\\\"$\\\", \\\"%\\\", \\\",\\\"]:\\n\",\n    \"        number_str = number_str.replace(char, \\\"\\\")\\n\",\n    \"    try:\\n\",\n    \"        return float(number_str)\\n\",\n    \"    except ValueError:\\n\",\n    \"        return float(\\\"inf\\\")\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def split_string(\\n\",\n    \"    s: str,\\n\",\n    \"    char_list: list[str] = [\\\",\\\", \\\";\\\"],\\n\",\n    \") -> list[str]:\\n\",\n    \"    pattern = f\\\"[{''.join(char_list)}]\\\"\\n\",\n    \"    return re.split(pattern, s)\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def is_float(element: any) -> bool:\\n\",\n    \"    try:\\n\",\n    \"        float(element)\\n\",\n    \"        return True\\n\",\n    \"    except ValueError:\\n\",\n    \"        return False\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def normalize_str(input_str, remove_punct=True) -> str:\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    Normalize a string by:\\n\",\n    \"    - Removing all white spaces\\n\",\n    \"    - Optionally removing punctuation (if remove_punct is True)\\n\",\n    \"    - Converting to lowercase\\n\",\n    \"    Parameters:\\n\",\n    \"    - input_str: str, the string to normalize\\n\",\n    \"    - remove_punct: bool, whether to remove punctuation (default: True)\\n\",\n    \"    Returns:\\n\",\n    \"    - str, the normalized string\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    # Remove all white spaces. Required e.g for seagull vs. 
sea gull\\n\",\n    \"    no_spaces = re.sub(r\\\"\\\\s\\\", \\\"\\\", input_str)\\n\",\n    \"\\n\",\n    \"    # Remove punctuation, if specified.\\n\",\n    \"    if remove_punct:\\n\",\n    \"        translator = str.maketrans(\\\"\\\", \\\"\\\", string.punctuation)\\n\",\n    \"        return no_spaces.lower().translate(translator)\\n\",\n    \"    else:\\n\",\n    \"        return no_spaces.lower()\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def extract_numbers(text: str) -> list[str]:\\n\",\n    \"    \\\"\\\"\\\"This pattern matches:\\n\",\n    \"    - Optional negative sign\\n\",\n    \"    - Numbers with optional comma thousand separators\\n\",\n    \"    - Optional decimal points with decimal numbers\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    pattern = r\\\"-?(?:\\\\d{1,3}(?:,\\\\d{3})+|\\\\d+)(?:\\\\.\\\\d+)?\\\"\\n\",\n    \"\\n\",\n    \"    return [el.replace(\\\",\\\", \\\"\\\") for el in re.findall(pattern, text)]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def get_question_score_gaia(\\n\",\n    \"    model_answer: str,\\n\",\n    \"    ground_truth: str,\\n\",\n    \") -> bool:\\n\",\n    \"    \\\"\\\"\\\"Scoring function used to score functions from the GAIA benchmark\\\"\\\"\\\"\\n\",\n    \"    if is_float(ground_truth):\\n\",\n    \"        normalized_answer = normalize_number_str(str(model_answer))\\n\",\n    \"        return normalized_answer == float(ground_truth)\\n\",\n    \"\\n\",\n    \"    elif any(char in ground_truth for char in [\\\",\\\", \\\";\\\"]):  # if gt is a list\\n\",\n    \"        # question with the fish: normalization removes punct\\n\",\n    \"        gt_elems = split_string(ground_truth)\\n\",\n    \"        ma_elems = split_string(model_answer)\\n\",\n    \"\\n\",\n    \"        if len(gt_elems) != len(ma_elems):  # check length is the same\\n\",\n    \"            warnings.warn(\\\"Answer lists have different lengths, returning False.\\\", UserWarning)\\n\",\n    \"            return False\\n\",\n    \"\\n\",\n    
\"        comparisons = []\\n\",\n    \"        for ma_elem, gt_elem in zip(ma_elems, gt_elems):  # compare each element as float or str\\n\",\n    \"            if is_float(gt_elem):\\n\",\n    \"                normalized_ma_elem = normalize_number_str(ma_elem)\\n\",\n    \"                comparisons.append(normalized_ma_elem == float(gt_elem))\\n\",\n    \"            else:\\n\",\n    \"                # we do not remove punct since comparisons can include punct\\n\",\n    \"                comparisons.append(\\n\",\n    \"                    normalize_str(ma_elem, remove_punct=False) == normalize_str(gt_elem, remove_punct=False)\\n\",\n    \"                )\\n\",\n    \"        return all(comparisons)\\n\",\n    \"\\n\",\n    \"    else:  # if gt is a str\\n\",\n    \"        return normalize_str(model_answer) == normalize_str(ground_truth)\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def get_correct(row):\\n\",\n    \"    if row[\\\"source\\\"] == \\\"MATH\\\":  # Checks the last number in answer\\n\",\n    \"        numbers_answer = extract_numbers(str(row[\\\"answer\\\"]))\\n\",\n    \"        if len(numbers_answer) == 0:\\n\",\n    \"            return False\\n\",\n    \"        return np.isclose(float(numbers_answer[-1]), float(row[\\\"true_answer\\\"]), rtol=1e-5, atol=1e-7)\\n\",\n    \"    else:\\n\",\n    \"        return get_question_score_gaia(str(row[\\\"answer\\\"]), str(row[\\\"true_answer\\\"]))\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def score_answers_subset(answers_dataset, answers_subset):\\n\",\n    \"    try:\\n\",\n    \"        print(answers_dataset, answers_subset)\\n\",\n    \"        *model_id, action_type, task = answers_subset.split(\\\"__\\\")\\n\",\n    \"        model_id = \\\"/\\\".join(model_id)\\n\",\n    \"        ds = datasets.load_dataset(answers_dataset, answers_subset, split=\\\"test\\\")\\n\",\n    \"        df = ds.to_pandas()\\n\",\n    \"        df[\\\"correct\\\"] = df.apply(get_correct, axis=1)\\n\",\n    \"        
assert df[\\\"correct\\\"].notnull().sum() > 30, \\\"Missing answers\\\"\\n\",\n    \"        acc = df[\\\"correct\\\"].mean().item()\\n\",\n    \"        result = df.loc[0, [\\\"model_id\\\", \\\"agent_action_type\\\", \\\"source\\\"]].to_dict()\\n\",\n    \"        result[\\\"acc\\\"] = acc\\n\",\n    \"        return result\\n\",\n    \"    except Exception as e:\\n\",\n    \"        print(f\\\"Error with {answers_subset}: {e}\\\")\\n\",\n    \"        return None\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def score_answers(\\n\",\n    \"    answers_subsets,\\n\",\n    \"    answers_dataset=ANSWERS_DATASET,\\n\",\n    \"    date=DATE,\\n\",\n    \"    push_to_hub_dataset=RESULTS_DATASET if PUSH_RESULTS_DATASET_TO_HUB else None,\\n\",\n    \"    set_default=True,\\n\",\n    \"):\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    Score answers from the given dataset subsets.\\n\",\n    \"\\n\",\n    \"    Parameters:\\n\",\n    \"        answers_subsets: List of dataset subsets to score\\n\",\n    \"        answers_dataset: Dataset containing the answers\\n\",\n    \"        date: Date to use for the config name\\n\",\n    \"        push_to_hub_dataset: Dataset ID to push results to, or None to skip pushing\\n\",\n    \"        set_default: If True, sets this config as the default config in the Hugging Face Hub dataset.\\n\",\n    \"                     This means when users load the dataset without specifying a config,\\n\",\n    \"                     this version will be loaded by default.\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    if not answers_dataset:\\n\",\n    \"        raise ValueError(\\\"Pass 'answers_dataset' to load the answers from it\\\")\\n\",\n    \"    date = date or datetime.date.today().isoformat()\\n\",\n    \"    results = []\\n\",\n    \"    with ThreadPoolExecutor(max_workers=16) as exe:\\n\",\n    \"        futures = [\\n\",\n    \"            exe.submit(score_answers_subset, answers_dataset, answers_subset) for answers_subset in 
answers_subsets\\n\",\n    \"        ]\\n\",\n    \"        for f in tqdm(as_completed(futures), total=len(answers_subsets), desc=\\\"Processing tasks\\\"):\\n\",\n    \"            result = f.result()\\n\",\n    \"            if result:\\n\",\n    \"                results.append(result)\\n\",\n    \"    df = pd.DataFrame(results)\\n\",\n    \"\\n\",\n    \"    if push_to_hub_dataset:\\n\",\n    \"        ds = datasets.Dataset.from_pandas(df)\\n\",\n    \"        config = date\\n\",\n    \"        ds.push_to_hub(push_to_hub_dataset, config_name=config, commit_message=f\\\"Upload {config} results\\\")\\n\",\n    \"    return df\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Score answers\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import datasets\\n\",\n    \"import pandas as pd\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"# Choose the answers subsets to score:\\n\",\n    \"# answers_subsets = [\\\"meta-llama__Llama-3.1-8B-Instruct__code__gaia\\\"]\\n\",\n    \"# or get all the answers subsets present in the ANSWERS_DATASET\\n\",\n    \"answers_subsets = datasets.get_dataset_config_names(ANSWERS_DATASET)\\n\",\n    \"print(\\\"Number of answers_subsets\\\", len(answers_subsets))\\n\",\n    \"print(\\\"Example of answers_subset\\\", answers_subsets[0])\\n\",\n    \"\\n\",\n    \"result_df = score_answers(answers_subsets)\\n\",\n    \"result_df[\\\"acc\\\"] = (result_df[\\\"acc\\\"] * 100).round(2)\\n\",\n    \"result_df.head()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"pivot_df = result_df.pivot_table(\\n\",\n    \"    index=[\\\"model_id\\\", \\\"source\\\"],\\n\",\n    \"    columns=[\\\"agent_action_type\\\"],\\n\",\n    \"    values=\\\"acc\\\",\\n\",\n    \"    fill_value=float(\\\"nan\\\"),\\n\",\n    
\").reset_index()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Display results\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"display(pivot_df)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import matplotlib.pyplot as plt\\n\",\n    \"from matplotlib.legend_handler import HandlerTuple  # Added import\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"# Assuming pivot_df is your original dataframe\\n\",\n    \"models = pivot_df[\\\"model_id\\\"].unique()\\n\",\n    \"sources = pivot_df[\\\"source\\\"].unique()\\n\",\n    \"\\n\",\n    \"# Create figure and axis\\n\",\n    \"plt.style.use(\\\"seaborn-v0_8-white\\\")\\n\",\n    \"fig, ax = plt.subplots(figsize=(15, 6))\\n\",\n    \"\\n\",\n    \"# Set the width of each bar group and positions of the bars\\n\",\n    \"width = 0.15  # width of each bar\\n\",\n    \"spacing = 0.02  # space between bars within a group\\n\",\n    \"group_spacing = 0.2  # space between model groups\\n\",\n    \"\\n\",\n    \"# Calculate positions for the bars\\n\",\n    \"num_sources = len(sources)\\n\",\n    \"total_width_per_group = (width + spacing) * num_sources * 2  # *2 for agent and vanilla\\n\",\n    \"x = np.arange(len(models)) * (total_width_per_group + group_spacing)\\n\",\n    \"\\n\",\n    \"# Plot bars for each source\\n\",\n    \"for i, source in enumerate(sources):\\n\",\n    \"    source_data = pivot_df[pivot_df[\\\"source\\\"] == source]\\n\",\n    \"    agent_scores = [\\n\",\n    \"        source_data[source_data[\\\"model_id\\\"] == model][\\\"code\\\"].values[0]\\n\",\n    \"        if len(source_data[source_data[\\\"model_id\\\"] == model]) > 0\\n\",\n    \"        else np.nan\\n\",\n    \"        for model in models\\n\",\n    \"    ]\\n\",\n    \"    vanilla_scores = 
[\\n\",\n    \"        source_data[source_data[\\\"model_id\\\"] == model][\\\"vanilla\\\"].values[0]\\n\",\n    \"        if len(source_data[source_data[\\\"model_id\\\"] == model]) > 0\\n\",\n    \"        else np.nan\\n\",\n    \"        for model in models\\n\",\n    \"    ]\\n\",\n    \"\\n\",\n    \"    # Position calculation for each pair of bars\\n\",\n    \"    pos = x + i * (width * 2 + spacing)\\n\",\n    \"\\n\",\n    \"    agent_bars = ax.bar(pos, agent_scores, width, label=f\\\"{source} (Agent)\\\", alpha=0.8)\\n\",\n    \"    vanilla_bars = ax.bar(\\n\",\n    \"        pos + width * 0.6,\\n\",\n    \"        vanilla_scores,\\n\",\n    \"        width,\\n\",\n    \"        hatch=\\\"////\\\",\\n\",\n    \"        alpha=0.5,\\n\",\n    \"        hatch_linewidth=2,\\n\",\n    \"        label=f\\\"{source} (Vanilla)\\\",\\n\",\n    \"        color=\\\"white\\\",\\n\",\n    \"        edgecolor=agent_bars[0].get_facecolor(),\\n\",\n    \"    )\\n\",\n    \"\\n\",\n    \"# Customize the plot\\n\",\n    \"ax.set_ylabel(\\\"Score\\\")\\n\",\n    \"ax.set_title(\\\"Model Performance Comparison\\\")\\n\",\n    \"\\n\",\n    \"# Set x-axis ticks in the middle of each group\\n\",\n    \"group_centers = x + (total_width_per_group - spacing) / 2\\n\",\n    \"ax.set_xticks(group_centers)\\n\",\n    \"\\n\",\n    \"# Wrap long model names to prevent overlap\\n\",\n    \"wrapped_labels = [\\\"\\\\n\\\".join(model.split(\\\"/\\\")) for model in models]\\n\",\n    \"ax.set_xticklabels(wrapped_labels, rotation=0, ha=\\\"center\\\")\\n\",\n    \"\\n\",\n    \"# Modify legend to combine agent and vanilla entries\\n\",\n    \"handles, labels = ax.get_legend_handles_labels()\\n\",\n    \"unique_sources = sources\\n\",\n    \"legend_elements = [\\n\",\n    \"    (handles[i * 2], handles[i * 2 + 1], labels[i * 2].replace(\\\" (Agent)\\\", \\\"\\\")) for i in range(len(unique_sources))\\n\",\n    \"]\\n\",\n    \"custom_legend = ax.legend(\\n\",\n    \"    [(agent_handle, 
vanilla_handle) for agent_handle, vanilla_handle, _ in legend_elements],\\n\",\n    \"    [label for _, _, label in legend_elements],\\n\",\n    \"    handler_map={tuple: HandlerTuple(ndivide=None)},\\n\",\n    \"    bbox_to_anchor=(1.05, 1),\\n\",\n    \"    loc=\\\"upper left\\\",\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"ax.yaxis.grid(True, linestyle=\\\"--\\\", alpha=0.3)\\n\",\n    \"ax.set_ylim(bottom=0)\\n\",\n    \"plt.tight_layout()\\n\",\n    \"ax.spines[\\\"top\\\"].set_visible(False)\\n\",\n    \"ax.spines[\\\"right\\\"].set_visible(False)\\n\",\n    \"\\n\",\n    \"plt.show()\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"agents\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.12.0\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 2\n}\n"
  },
  {
    "path": "examples/structured_output_tool.py",
    "content": "# How to run with uv:\n#   uv run structured_output_tool.py\n#\n# Modify the smolagents dependency to point to the local smolagents repo or\n# remove `@ file:///<path-to-smolagents>`\n#\n# /// script\n# requires-python = \">=3.10\"\n# dependencies = [\n#   \"smolagents[mcp,litellm] @ file:///<path-to-smolagents>\",\n#   \"pydantic\",\n# ]\n# ///\n\nfrom textwrap import dedent\n\nfrom mcp import StdioServerParameters\n\nfrom smolagents import CodeAgent, InferenceClientModel, LiteLLMModel, MCPClient  # noqa: F401\n\n\ndef weather_server_script() -> str:\n    \"\"\"Return an inline MCP server script that exposes a weather tool.\"\"\"\n    return dedent(\n        '''\n        from pydantic import BaseModel, Field\n        from mcp.server.fastmcp import FastMCP\n\n        mcp = FastMCP(\"Weather Service\")\n\n        class WeatherInfo(BaseModel):\n            location: str = Field(description=\"The location name\")\n            temperature: float = Field(description=\"Temperature in Celsius\")\n            conditions: str = Field(description=\"Weather conditions\")\n            humidity: int = Field(description=\"Humidity percentage\", ge=0, le=100)\n\n        @mcp.tool(\n            name=\"get_weather_info\",\n            description=\"Get weather information for a location as structured data.\",\n        )\n        def get_weather_info(city: str) -> WeatherInfo:\n            \"\"\"Get weather information for a city.\"\"\"\n            return WeatherInfo(\n                location=city,\n                temperature=22.5,\n                conditions=\"partly cloudy\",\n                humidity=65\n            )\n\n        mcp.run()\n        '''\n    )\n\n\ndef main() -> None:\n    # Configure your inference model\n    # model = InferenceClientModel()\n    model = LiteLLMModel(\n        model_id=\"mistral/mistral-small-latest\",\n        # model_id=\"openai/gpt-4o-mini\",\n    )\n\n    # Start the Weather MCP server from an inline script in this same 
file\n    serverparams = StdioServerParameters(command=\"python\", args=[\"-c\", weather_server_script()])\n\n    # Bridge MCP tools into smolagents with structured outputs enabled\n    with MCPClient(\n        serverparams,\n        structured_output=True,\n    ) as tools:\n        agent = CodeAgent(tools=tools, model=model)\n        # Example query that encourages tool use and unit conversion\n        agent.run(\"What is the temperature in Tokyo in Fahrenheit?\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "examples/text_to_sql.py",
    "content": "from sqlalchemy import (\n    Column,\n    Float,\n    Integer,\n    MetaData,\n    String,\n    Table,\n    create_engine,\n    insert,\n    inspect,\n    text,\n)\n\n\nengine = create_engine(\"sqlite:///:memory:\")\nmetadata_obj = MetaData()\n\n# create city SQL table\ntable_name = \"receipts\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"customer_name\", String(16), primary_key=True),\n    Column(\"price\", Float),\n    Column(\"tip\", Float),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"customer_name\": \"Alan Payne\", \"price\": 12.06, \"tip\": 1.20},\n    {\"receipt_id\": 2, \"customer_name\": \"Alex Mason\", \"price\": 23.86, \"tip\": 0.24},\n    {\"receipt_id\": 3, \"customer_name\": \"Woodrow Wilson\", \"price\": 53.43, \"tip\": 5.43},\n    {\"receipt_id\": 4, \"customer_name\": \"Margaret James\", \"price\": 21.11, \"tip\": 1.00},\n]\nfor row in rows:\n    stmt = insert(receipts).values(**row)\n    with engine.begin() as connection:\n        cursor = connection.execute(stmt)\n\ninspector = inspect(engine)\ncolumns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(\"receipts\")]\n\ntable_description = \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\nprint(table_description)\n\nfrom smolagents import tool\n\n\n@tool\ndef sql_engine(query: str) -> str:\n    \"\"\"\n    Allows you to perform SQL queries on the table. Returns a string representation of the result.\n    The table is named 'receipts'. Its description is as follows:\n        Columns:\n        - receipt_id: INTEGER\n        - customer_name: VARCHAR(16)\n        - price: FLOAT\n        - tip: FLOAT\n\n    Args:\n        query: The query to perform. 
This should be correct SQL.\n    \"\"\"\n    output = \"\"\n    with engine.connect() as con:\n        rows = con.execute(text(query))\n        for row in rows:\n            output += \"\\n\" + str(row)\n    return output\n\n\nfrom smolagents import CodeAgent, InferenceClientModel\n\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=InferenceClientModel(model_id=\"meta-llama/Meta-Llama-3.1-8B-Instruct\"),\n)\nagent.run(\"Can you give me the name of the client who got the most expensive receipt?\")\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[build-system]\nrequires = [\"setuptools\"]\nbuild-backend = \"setuptools.build_meta\"\n\n[project]\nname = \"smolagents\"\nversion = \"1.25.0.dev0\"\ndescription = \"🤗 smolagents: a barebones library for agents. Agents write python code to call tools or orchestrate other agents.\"\nauthors = [\n  { name=\"Aymeric Roucher\", email=\"aymeric@hf.co\" },\n]\nreadme = \"README.md\"\nrequires-python = \">=3.10\"\ndependencies = [\n  \"huggingface-hub>=0.31.2\",\n  \"requests>=2.32.3\",\n  \"rich>=13.9.4\",\n  \"jinja2>=3.1.4\",\n  \"pillow>=10.0.1\",\n  # Security fix for CVE-2023-4863: https://pillow.readthedocs.io/en/stable/releasenotes/10.0.1.html\n  \"python-dotenv\",\n]\n\n[project.optional-dependencies]\nbedrock = [\n  \"boto3>=1.36.18\"\n]\nblaxel = [\n  \"blaxel>=0.2.19\",\n  \"websocket-client\",\n]\ntorch = [\n  \"torch\",\n  \"torchvision\",\n  \"numpy>=1.21.2\",\n]\naudio = [\n  \"soundfile\",\n  \"smolagents[torch]\",\n]\ndocker = [\n  \"docker>=7.1.0\",\n  \"websocket-client\",\n]\ne2b = [\n  \"e2b-code-interpreter>=1.0.3\",\n  \"python-dotenv>=1.0.1\",\n]\ngradio = [\n  \"gradio>=5.14.0\",  # Sidebar component GH-797\n]\nlitellm = [\n  \"litellm>=1.60.2\",\n]\nmcp = [\n  \"mcpadapt>=0.1.13\",  # Support structured output\n  \"mcp\",\n]\nmlx-lm = [\n  \"mlx-lm\",\n]\nmodal = [\n  \"modal>=1.1.3\",\n  \"websocket-client\",\n]\nopenai = [\n  \"openai>=1.58.1\"\n]\ntelemetry = [\n  \"arize-phoenix\",\n  \"opentelemetry-sdk\",\n  \"opentelemetry-exporter-otlp\",\n  \"openinference-instrumentation-smolagents>=0.1.15\"  # Use new TokenUsage structure\n]\ntoolkit = [\n  \"ddgs>=9.0.0\",  # DuckDuckGoSearchTool\n  \"markdownify>=0.14.1\",  # VisitWebpageTool\n]\ntransformers = [\n  \"accelerate\",\n  \"transformers>=4.0.0\",\n  \"smolagents[torch]\",\n]\nvision = [\n  \"helium\",\n  \"selenium\",\n]\nvllm = [\n  \"vllm>=0.10.2\",\n  \"torch\"\n]\nall = [\n  
\"smolagents[audio,blaxel,docker,e2b,gradio,litellm,mcp,mlx-lm,modal,openai,telemetry,toolkit,transformers,vision,bedrock]\",\n]\nquality = [\n  \"ruff>=0.9.0\",\n]\ntest = [\n  \"ipython>=8.31.0\", # for interactive environment tests\n  \"pandas>=2.2.3\",\n  \"pytest>=8.1.0\",\n  \"pytest-datadir\",\n  \"pytest-timeout\",  # For test_all_docs: @pytest.mark.timeout\n  \"python-dotenv>=1.0.1\", # For test_all_docs\n  \"smolagents[all]\",\n  \"rank-bm25\", # For test_all_docs\n  \"Wikipedia-API>=0.8.1\",\n  \"mlx[cpu]\",  # GH-1588\n]\ndev = [\n  \"smolagents[quality,test]\",\n  \"sqlalchemy\", # for ./examples\n]\n\n[tool.pytest.ini_options]\n# Add the specified `OPTS` to the set of command line arguments as if they had been specified by the user.\naddopts = \"-sv --durations=0\"\n\n[tool.ruff]\nline-length = 119\nlint.ignore = [\n  \"F403\", # undefined-local-with-import-star\n  \"E501\", # line-too-long\n]\nlint.select = [\"E\", \"F\", \"I\", \"W\"]\n\n[tool.ruff.lint.per-file-ignores]\n\"examples/*\" = [\n  \"E402\", # module-import-not-at-top-of-file\n]\n\n[tool.ruff.lint.isort]\nknown-first-party = [\"smolagents\"]\nlines-after-imports = 2\n\n[tool.setuptools.package-data]\n\"smolagents.prompts\" = [\"*.yaml\"]\n\n[project.scripts]\nsmolagent = \"smolagents.cli:main\"\nwebagent = \"smolagents.vision_web_browser:main\"\n"
  },
  {
    "path": "src/smolagents/__init__.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n__version__ = \"1.25.0.dev0\"\n\nfrom .agent_types import *  # noqa: I001\nfrom .agents import *  # Above noqa avoids a circular dependency due to cli.py\nfrom .default_tools import *\nfrom .gradio_ui import *\nfrom .local_python_executor import *\nfrom .mcp_client import *\nfrom .memory import *\nfrom .models import *\nfrom .monitoring import *\nfrom .remote_executors import *\nfrom .serialization import *\nfrom .tools import *\nfrom .utils import *\nfrom .cli import *\n"
  },
  {
    "path": "src/smolagents/_function_type_hints_utils.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2025 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"This module contains utilities exclusively taken from `transformers` repository.\n\nSince they are not specific to `transformers` and that `transformers` is an heavy dependencies, those helpers have\nbeen duplicated.\n\nTODO: move them to `huggingface_hub` to avoid code duplication.\n\"\"\"\n\nimport inspect\nimport json\nimport re\nimport types\nfrom collections.abc import Callable\nfrom copy import copy\nfrom typing import (\n    Any,\n    Literal,\n    Union,\n    get_args,\n    get_origin,\n    get_type_hints,\n)\n\n\nIMPORT_TO_PACKAGE_MAPPING = {\n    \"wikipediaapi\": \"wikipedia-api\",\n}\n\n\ndef get_package_name(import_name: str) -> str:\n    \"\"\"\n    Return the package name for a given import name.\n\n    Args:\n        import_name (`str`): Import name to get the package name for.\n\n    Returns:\n        `str`: Package name for the given import name.\n    \"\"\"\n    return IMPORT_TO_PACKAGE_MAPPING.get(import_name, import_name)\n\n\ndef get_imports(code: str) -> list[str]:\n    \"\"\"\n    Extracts all the libraries (not relative imports) that are imported in a code.\n\n    Args:\n        code (`str`): Code text to inspect.\n\n    Returns:\n        `list[str]`: List of all packages required to use the input code.\n    \"\"\"\n    # filter out try/except block so in custom 
code we can have try/except imports\n    code = re.sub(r\"\\s*try\\s*:.*?except.*?:\", \"\", code, flags=re.DOTALL)\n\n    # filter out imports under is_flash_attn_2_available block to avoid import issues in CPU-only environments\n    code = re.sub(\n        r\"if is_flash_attn[a-zA-Z0-9_]+available\\(\\):\\s*(from flash_attn\\s*.*\\s*)+\",\n        \"\",\n        code,\n        flags=re.MULTILINE,\n    )\n\n    # Imports of the form `import xxx` or `import xxx as yyy`\n    imports = re.findall(r\"^\\s*import\\s+(\\S+?)(?:\\s+as\\s+\\S+)?\\s*$\", code, flags=re.MULTILINE)\n    # Imports of the form `from xxx import yyy`\n    imports += re.findall(r\"^\\s*from\\s+(\\S+)\\s+import\", code, flags=re.MULTILINE)\n    # Only keep the top-level module\n    imports = [imp.split(\".\")[0] for imp in imports if not imp.startswith(\".\")]\n    return [get_package_name(import_name) for import_name in set(imports)]\n\n\nclass TypeHintParsingException(Exception):\n    \"\"\"Exception raised for errors in parsing type hints to generate JSON schemas\"\"\"\n\n\nclass DocstringParsingException(Exception):\n    \"\"\"Exception raised for errors in parsing docstrings to generate JSON schemas\"\"\"\n\n\ndef get_json_schema(func: Callable) -> dict:\n    \"\"\"\n    This function generates a JSON schema for a given function, based on its docstring and type hints. This is\n    mostly used for passing lists of tools to a chat template. The JSON schema contains the name and description of\n    the function, as well as the names, types and descriptions for each of its arguments. `get_json_schema()` requires\n    that the function has a docstring, and that each argument has a description in the docstring, in the standard\n    Google docstring format shown below. It also requires that all the function arguments have a valid Python type hint.\n\n    Although it is not required, a `Returns` block can also be added, which will be included in the schema. 
This is\n    optional because most chat templates ignore the return value of the function.\n\n    Args:\n        func: The function to generate a JSON schema for.\n\n    Returns:\n        A dictionary containing the JSON schema for the function.\n\n    Examples:\n    ```python\n    >>> def multiply(x: float, y: float):\n    >>>    '''\n    >>>    A function that multiplies two numbers\n    >>>\n    >>>    Args:\n    >>>        x: The first number to multiply\n    >>>        y: The second number to multiply\n    >>>    '''\n    >>>    return x * y\n    >>>\n    >>> print(get_json_schema(multiply))\n    {\n        \"name\": \"multiply\",\n        \"description\": \"A function that multiplies two numbers\",\n        \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"x\": {\"type\": \"number\", \"description\": \"The first number to multiply\"},\n                \"y\": {\"type\": \"number\", \"description\": \"The second number to multiply\"}\n            },\n            \"required\": [\"x\", \"y\"]\n        }\n    }\n    ```\n\n    The general use for these schemas is that they are used to generate tool descriptions for chat templates that\n    support them, like so:\n\n    ```python\n    >>> from transformers import AutoTokenizer\n    >>> from transformers.utils import get_json_schema\n    >>>\n    >>> def multiply(x: float, y: float):\n    >>>    '''\n    >>>    A function that multiplies two numbers\n    >>>\n    >>>    Args:\n    >>>        x: The first number to multiply\n    >>>        y: The second number to multiply\n    >>>    '''\n    >>>    return x * y\n    >>>\n    >>> multiply_schema = get_json_schema(multiply)\n    >>> tokenizer = AutoTokenizer.from_pretrained(\"CohereLabs/c4ai-command-r-v01\")\n    >>> messages = [{\"role\": \"user\", \"content\": \"What is 179 x 4571?\"}]\n    >>> formatted_chat = tokenizer.apply_chat_template(\n    >>>     messages,\n    >>>     tools=[multiply_schema],\n    >>>     
chat_template=\"tool_use\",\n    >>>     return_dict=True,\n    >>>     return_tensors=\"pt\",\n    >>>     add_generation_prompt=True\n    >>> )\n    >>> # The formatted chat can now be passed to model.generate()\n    ```\n\n    Each argument description can also have an optional `(choices: ...)` block at the end, such as\n    `(choices: [\"tea\", \"coffee\"])`, which will be parsed into an `enum` field in the schema. Note that this will\n    only be parsed correctly if it is at the end of the line:\n\n    ```python\n    >>> def drink_beverage(beverage: str):\n    >>>    '''\n    >>>    A function that drinks a beverage\n    >>>\n    >>>    Args:\n    >>>        beverage: The beverage to drink (choices: [\"tea\", \"coffee\"])\n    >>>    '''\n    >>>    pass\n    >>>\n    >>> print(get_json_schema(drink_beverage))\n    ```\n    {\n        'name': 'drink_beverage',\n        'description': 'A function that drinks a beverage',\n        'parameters': {\n            'type': 'object',\n            'properties': {\n                'beverage': {\n                    'type': 'string',\n                    'enum': ['tea', 'coffee'],\n                    'description': 'The beverage to drink'\n                    }\n                },\n            'required': ['beverage']\n        }\n    }\n    \"\"\"\n    doc = inspect.getdoc(func)\n    if not doc:\n        raise DocstringParsingException(\n            f\"Cannot generate JSON schema for {func.__name__} because it has no docstring!\"\n        )\n    doc = doc.strip()\n    main_doc, param_descriptions, return_doc = _parse_google_format_docstring(doc)\n\n    json_schema = _convert_type_hints_to_json_schema(func)\n    if (return_dict := json_schema[\"properties\"].pop(\"return\", None)) is not None:\n        if return_doc is not None:  # We allow a missing return docstring since most templates ignore it\n            return_dict[\"description\"] = return_doc\n    for arg, schema in json_schema[\"properties\"].items():\n        
if arg not in param_descriptions:\n            raise DocstringParsingException(\n                f\"Cannot generate JSON schema for {func.__name__} because the docstring has no description for the argument '{arg}'\"\n            )\n        desc = param_descriptions[arg]\n        enum_choices = re.search(r\"\\(choices:\\s*(.*?)\\)\\s*$\", desc, flags=re.IGNORECASE)\n        if enum_choices:\n            schema[\"enum\"] = [c.strip() for c in json.loads(enum_choices.group(1))]\n            desc = enum_choices.string[: enum_choices.start()].strip()\n        schema[\"description\"] = desc\n\n    output = {\"name\": func.__name__, \"description\": main_doc, \"parameters\": json_schema}\n    if return_dict is not None:\n        output[\"return\"] = return_dict\n    return {\"type\": \"function\", \"function\": output}\n\n\n# Extracts the initial segment of the docstring, containing the function description\ndescription_re = re.compile(r\"^(.*?)(?=\\n\\s*(Args:|Returns:|Raises:)|\\Z)\", re.DOTALL)\n# Extracts the Args: block from the docstring\nargs_re = re.compile(r\"\\n\\s*Args:\\n\\s*(.*?)[\\n\\s]*(Returns:|Raises:|\\Z)\", re.DOTALL)\n# Splits the Args: block into individual arguments\nargs_split_re = re.compile(\n    r\"(?:^|\\n)\"  # Match the start of the args block, or a newline\n    r\"\\s*(\\w+)\\s*(?:\\([^)]*?\\))?:\\s*\"  # Capture the argument name (ignore the type) and strip spacing\n    r\"(.*?)\\s*\"  # Capture the argument description, which can span multiple lines, and strip trailing spacing\n    r\"(?=\\n\\s*\\w+\\s*(?:\\([^)]*?\\))?:|\\Z)\",  # Stop when you hit the next argument (with or without type) or the end of the block\n    re.DOTALL | re.VERBOSE,\n)\n# Extracts the Returns: block from the docstring, if present. 
Note that most chat templates ignore the return type/doc!\nreturns_re = re.compile(\n    r\"\\n\\s*Returns:\\n\\s*\"\n    r\"(?:[^)]*?:\\s*)?\"  # Ignore the return type if present\n    r\"(.*?)\"  # Capture the return description\n    r\"[\\n\\s]*(Raises:|\\Z)\",\n    re.DOTALL,\n)\n\n\ndef _parse_google_format_docstring(\n    docstring: str,\n) -> tuple[str | None, dict | None, str | None]:\n    \"\"\"\n    Parses a Google-style docstring to extract the function description,\n    argument descriptions, and return description.\n\n    Args:\n        docstring (str): The docstring to parse.\n\n    Returns:\n        The function description, arguments, and return description.\n    \"\"\"\n\n    # Extract the sections\n    description_match = description_re.search(docstring)\n    args_match = args_re.search(docstring)\n    returns_match = returns_re.search(docstring)\n\n    # Clean and store the sections\n    description = description_match.group(1).strip() if description_match else None\n    docstring_args = args_match.group(1).strip() if args_match else None\n    returns = returns_match.group(1).strip() if returns_match else None\n\n    # Parsing the arguments into a dictionary\n    if docstring_args is not None:\n        docstring_args = \"\\n\".join([line for line in docstring_args.split(\"\\n\") if line.strip()])  # Remove blank lines\n        matches = args_split_re.findall(docstring_args)\n        args_dict = {match[0]: re.sub(r\"\\s*\\n+\\s*\", \" \", match[1].strip()) for match in matches}\n    else:\n        args_dict = {}\n\n    return description, args_dict, returns\n\n\ndef _convert_type_hints_to_json_schema(func: Callable, error_on_missing_type_hints: bool = True) -> dict:\n    type_hints = get_type_hints(func)\n    signature = inspect.signature(func)\n\n    properties = {}\n    for param_name, param_type in type_hints.items():\n        properties[param_name] = _parse_type_hint(param_type)\n\n    required = []\n    for param_name, param in 
signature.parameters.items():\n        if param.annotation == inspect.Parameter.empty and error_on_missing_type_hints:\n            raise TypeHintParsingException(f\"Argument {param.name} is missing a type hint in function {func.__name__}\")\n        if param_name not in properties:\n            properties[param_name] = {}\n\n        if param.default == inspect.Parameter.empty:\n            required.append(param_name)\n        else:\n            properties[param_name][\"nullable\"] = True\n\n    # Return: multi‐type union -> treat as any\n    if (\n        \"return\" in properties\n        and (return_type := properties[\"return\"].get(\"type\"))\n        and not isinstance(return_type, str)\n    ):\n        properties[\"return\"][\"type\"] = \"any\"\n\n    schema = {\"type\": \"object\", \"properties\": properties}\n    if required:\n        schema[\"required\"] = required\n\n    return schema\n\n\ndef _parse_type_hint(hint: type) -> dict:\n    origin = get_origin(hint)\n    args = get_args(hint)\n\n    if origin is None:\n        try:\n            return _get_json_schema_type(hint)\n        except KeyError:\n            raise TypeHintParsingException(\n                \"Couldn't parse this type hint, likely due to a custom class or object: \",\n                hint,\n            )\n\n    elif origin is Union or (hasattr(types, \"UnionType\") and origin is types.UnionType):\n        return _parse_union_type(args)\n\n    elif origin is list:\n        if not args:\n            return {\"type\": \"array\"}\n        else:\n            # Lists can only have a single type argument, so recurse into it\n            return {\"type\": \"array\", \"items\": _parse_type_hint(args[0])}\n\n    elif origin is tuple:\n        if not args:\n            return {\"type\": \"array\"}\n        if len(args) == 1:\n            raise TypeHintParsingException(\n                f\"The type hint {str(hint).replace('typing.', '')} is a Tuple with a single element, which \"\n                
\"we do not automatically convert to JSON schema as it is rarely necessary. If this input can contain \"\n                \"more than one element, we recommend \"\n                \"using a List[] type instead, or if it really is a single element, remove the Tuple[] wrapper and just \"\n                \"pass the element directly.\"\n            )\n        if ... in args:\n            raise TypeHintParsingException(\n                \"Conversion of '...' is not supported in Tuple type hints. \"\n                \"Use List[] types for variable-length\"\n                \" inputs instead.\"\n            )\n        return {\"type\": \"array\", \"prefixItems\": [_parse_type_hint(t) for t in args]}\n\n    elif origin is dict:\n        # The JSON equivalent to a dict is 'object', which mandates that all keys are strings\n        # However, we can specify the type of the dict values with \"additionalProperties\"\n        out = {\"type\": \"object\"}\n        if len(args) == 2:\n            out[\"additionalProperties\"] = _parse_type_hint(args[1])\n        return out\n\n    elif origin is Literal:\n        literal_types = set(type(arg) for arg in args)\n        final_type = _parse_union_type(literal_types)\n\n        # None literal value is represented by 'nullable' field set by _parse_union_type\n        final_type.update({\"enum\": [arg for arg in args if arg is not None]})\n        return final_type\n\n    raise TypeHintParsingException(\"Couldn't parse this type hint, likely due to a custom class or object: \", hint)\n\n\ndef _parse_union_type(args: tuple[Any, ...]) -> dict:\n    subtypes = [_parse_type_hint(t) for t in args if t is not type(None)]\n    if len(subtypes) == 1:\n        # A single non-null type can be expressed directly\n        return_dict = subtypes[0]\n    elif all(isinstance(subtype[\"type\"], str) for subtype in subtypes):\n        # A union of basic types can be expressed as a list in the schema\n        return_dict = {\"type\": 
sorted([subtype[\"type\"] for subtype in subtypes])}\n    else:\n        # A union of more complex types requires \"anyOf\"\n        return_dict = {\"anyOf\": subtypes}\n    if type(None) in args:\n        return_dict[\"nullable\"] = True\n    return return_dict\n\n\n_BASE_TYPE_MAPPING = {\n    int: {\"type\": \"integer\"},\n    float: {\"type\": \"number\"},\n    str: {\"type\": \"string\"},\n    bool: {\"type\": \"boolean\"},\n    list: {\"type\": \"array\"},\n    dict: {\"type\": \"object\"},\n    Any: {\"type\": \"any\"},\n    types.NoneType: {\"type\": \"null\"},\n}\n\n\ndef _get_json_schema_type(param_type: type) -> dict[str, str]:\n    if param_type in _BASE_TYPE_MAPPING:\n        return copy(_BASE_TYPE_MAPPING[param_type])\n    if str(param_type) == \"Image\":\n        from PIL.Image import Image\n\n        if param_type == Image:\n            return {\"type\": \"image\"}\n    if str(param_type) == \"Tensor\":\n        try:\n            from torch import Tensor\n\n            if param_type == Tensor:\n                return {\"type\": \"audio\"}\n        except ModuleNotFoundError:\n            pass\n    return {\"type\": \"object\"}\n"
  },
  {
    "path": "src/smolagents/agent_types.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport logging\nimport os\nimport pathlib\nimport tempfile\nimport uuid\nfrom io import BytesIO\nfrom typing import Any\n\nimport PIL.Image\nimport requests\n\nfrom .utils import _is_package_available\n\n\nlogger = logging.getLogger(__name__)\n\n\nclass AgentType:\n    \"\"\"\n    Abstract class to be reimplemented to define types that can be returned by agents.\n\n    These objects serve three purposes:\n\n    - They behave as they were the type they're meant to be, e.g., a string for text, a PIL.Image.Image for images\n    - They can be stringified: str(object) in order to return a string defining the object\n    - They should be displayed correctly in ipython notebooks/colab/jupyter\n    \"\"\"\n\n    def __init__(self, value):\n        self._value = value\n\n    def __str__(self):\n        return self.to_string()\n\n    def to_raw(self):\n        logger.error(\n            \"This is a raw AgentType of unknown type. Display in notebooks and string conversion will be unreliable\"\n        )\n        return self._value\n\n    def to_string(self) -> str:\n        logger.error(\n            \"This is a raw AgentType of unknown type. Display in notebooks and string conversion will be unreliable\"\n        )\n        return str(self._value)\n\n\nclass AgentText(AgentType, str):\n    \"\"\"\n    Text type returned by the agent. 
Behaves as a string.\n    \"\"\"\n\n    def to_raw(self):\n        return self._value\n\n    def to_string(self):\n        return str(self._value)\n\n\nclass AgentImage(AgentType, PIL.Image.Image):\n    \"\"\"\n    Image type returned by the agent. Behaves as a PIL.Image.Image.\n    \"\"\"\n\n    def __init__(self, value):\n        AgentType.__init__(self, value)\n        PIL.Image.Image.__init__(self)\n\n        self._path = None\n        self._raw = None\n        self._tensor = None\n\n        if isinstance(value, AgentImage):\n            self._raw, self._path, self._tensor = value._raw, value._path, value._tensor\n        elif isinstance(value, PIL.Image.Image):\n            self._raw = value\n        elif isinstance(value, bytes):\n            self._raw = PIL.Image.open(BytesIO(value))\n        elif isinstance(value, (str, pathlib.Path)):\n            self._path = value\n        else:\n            try:\n                import torch\n\n                if isinstance(value, torch.Tensor):\n                    self._tensor = value\n                import numpy as np\n\n                if isinstance(value, np.ndarray):\n                    self._tensor = torch.from_numpy(value)\n            except ModuleNotFoundError:\n                pass\n\n        if self._path is None and self._raw is None and self._tensor is None:\n            raise TypeError(f\"Unsupported type for {self.__class__.__name__}: {type(value)}\")\n\n    def _ipython_display_(self, include=None, exclude=None):\n        \"\"\"\n        Displays correctly this type in an ipython notebook (ipython, colab, jupyter, ...)\n        \"\"\"\n        from IPython.display import Image, display\n\n        display(Image(self.to_string()))\n\n    def to_raw(self):\n        \"\"\"\n        Returns the \"raw\" version of that object. 
In the case of an AgentImage, it is a PIL.Image.Image.\n        \"\"\"\n        if self._raw is not None:\n            return self._raw\n\n        if self._path is not None:\n            self._raw = PIL.Image.open(self._path)\n            return self._raw\n\n        if self._tensor is not None:\n            import numpy as np\n\n            array = self._tensor.cpu().detach().numpy()\n            return PIL.Image.fromarray((255 - array * 255).astype(np.uint8))\n\n    def to_string(self):\n        \"\"\"\n        Returns the stringified version of that object. In the case of an AgentImage, it is a path to the serialized\n        version of the image.\n        \"\"\"\n        if self._path is not None:\n            return self._path\n\n        if self._raw is not None:\n            directory = tempfile.mkdtemp()\n            self._path = os.path.join(directory, str(uuid.uuid4()) + \".png\")\n            self._raw.save(self._path, format=\"png\")\n            return self._path\n\n        if self._tensor is not None:\n            import numpy as np\n\n            array = self._tensor.cpu().detach().numpy()\n\n            # There is likely a simpler way than converting to an image and then saving it\n            img = PIL.Image.fromarray((255 - array * 255).astype(np.uint8))\n\n            directory = tempfile.mkdtemp()\n            self._path = os.path.join(directory, str(uuid.uuid4()) + \".png\")\n            img.save(self._path, format=\"png\")\n\n            return self._path\n\n    def save(self, output_bytes, format: str = None, **params):\n        \"\"\"\n        Saves the image to a file.\n        Args:\n            output_bytes (bytes): The output bytes to save the image to.\n            format (str): The format to use for the output image. 
The format is the same as in PIL.Image.save.\n            **params: Additional parameters to pass to PIL.Image.save.\n        \"\"\"\n        img = self.to_raw()\n        img.save(output_bytes, format=format, **params)\n\n\nclass AgentAudio(AgentType, str):\n    \"\"\"\n    Audio type returned by the agent.\n    \"\"\"\n\n    def __init__(self, value, samplerate=16_000):\n        if not _is_package_available(\"soundfile\") or not _is_package_available(\"torch\"):\n            raise ModuleNotFoundError(\n                \"Please install 'audio' extra to use AgentAudio: `pip install 'smolagents[audio]'`\"\n            )\n        import numpy as np\n        import torch\n\n        super().__init__(value)\n\n        self._path = None\n        self._tensor = None\n\n        self.samplerate = samplerate\n        if isinstance(value, (str, pathlib.Path)):\n            self._path = value\n        elif isinstance(value, torch.Tensor):\n            self._tensor = value\n        elif isinstance(value, tuple):\n            self.samplerate = value[0]\n            if isinstance(value[1], np.ndarray):\n                self._tensor = torch.from_numpy(value[1])\n            else:\n                self._tensor = torch.tensor(value[1])\n        else:\n            raise ValueError(f\"Unsupported audio type: {type(value)}\")\n\n    def _ipython_display_(self, include=None, exclude=None):\n        \"\"\"\n        Displays correctly this type in an ipython notebook (ipython, colab, jupyter, ...)\n        \"\"\"\n        from IPython.display import Audio, display\n\n        display(Audio(self.to_string(), rate=self.samplerate))\n\n    def to_raw(self):\n        \"\"\"\n        Returns the \"raw\" version of that object. 
It is a `torch.Tensor` object.\n        \"\"\"\n        import soundfile as sf\n\n        if self._tensor is not None:\n            return self._tensor\n\n        import torch\n\n        if self._path is not None:\n            if \"://\" in str(self._path):\n                response = requests.get(self._path)\n                response.raise_for_status()\n                tensor, self.samplerate = sf.read(BytesIO(response.content))\n            else:\n                tensor, self.samplerate = sf.read(self._path)\n            self._tensor = torch.tensor(tensor)\n            return self._tensor\n\n    def to_string(self):\n        \"\"\"\n        Returns the stringified version of that object. In the case of an AgentAudio, it is a path to the serialized\n        version of the audio.\n        \"\"\"\n        import soundfile as sf\n\n        if self._path is not None:\n            return self._path\n\n        if self._tensor is not None:\n            directory = tempfile.mkdtemp()\n            self._path = os.path.join(directory, str(uuid.uuid4()) + \".wav\")\n            sf.write(self._path, self._tensor, samplerate=self.samplerate)\n            return self._path\n\n\n_AGENT_TYPE_MAPPING = {\"string\": AgentText, \"image\": AgentImage, \"audio\": AgentAudio}\n\n\ndef handle_agent_input_types(*args, **kwargs):\n    args = [(arg.to_raw() if isinstance(arg, AgentType) else arg) for arg in args]\n    kwargs = {k: (v.to_raw() if isinstance(v, AgentType) else v) for k, v in kwargs.items()}\n    return args, kwargs\n\n\ndef handle_agent_output_types(output: Any, output_type: str | None = None) -> Any:\n    if output_type in _AGENT_TYPE_MAPPING:\n        # If the class has defined outputs, we can map directly according to the class definition\n        decoded_outputs = _AGENT_TYPE_MAPPING[output_type](output)\n        return decoded_outputs\n\n    # If the class does not have defined output, then we map according to the type\n    if isinstance(output, str):\n        return 
AgentText(output)\n    if isinstance(output, PIL.Image.Image):\n        return AgentImage(output)\n    try:\n        import torch\n\n        if isinstance(output, torch.Tensor):\n            return AgentAudio(output)\n    except ModuleNotFoundError:\n        pass\n    return output\n\n\n__all__ = [\"AgentType\", \"AgentImage\", \"AgentText\", \"AgentAudio\"]\n"
  },
  {
    "path": "src/smolagents/agents.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport importlib\nimport json\nimport os\nimport tempfile\nimport textwrap\nimport time\nimport warnings\nfrom abc import ABC, abstractmethod\nfrom collections.abc import Callable, Generator\nfrom concurrent.futures import ThreadPoolExecutor, as_completed\nfrom contextvars import copy_context\nfrom dataclasses import dataclass\nfrom logging import getLogger\nfrom pathlib import Path\nfrom typing import TYPE_CHECKING, Any, Literal, Type, TypeAlias, TypedDict, Union\n\nimport yaml\nfrom huggingface_hub import create_repo, metadata_update, snapshot_download, upload_folder\nfrom jinja2 import StrictUndefined, Template\nfrom rich.console import Group\nfrom rich.live import Live\nfrom rich.markdown import Markdown\nfrom rich.panel import Panel\nfrom rich.rule import Rule\nfrom rich.text import Text\n\n\nif TYPE_CHECKING:\n    import PIL.Image\n\nfrom .agent_types import AgentAudio, AgentImage, handle_agent_output_types\nfrom .default_tools import TOOL_MAPPING, FinalAnswerTool\nfrom .local_python_executor import BASE_BUILTIN_MODULES, LocalPythonExecutor, PythonExecutor, fix_final_answer_code\nfrom .memory import (\n    ActionStep,\n    AgentMemory,\n    CallbackRegistry,\n    FinalAnswerStep,\n    MemoryStep,\n    PlanningStep,\n    SystemPromptStep,\n    TaskStep,\n    Timing,\n    
ToolCall,\n)\nfrom .models import (\n    CODEAGENT_RESPONSE_FORMAT,\n    MODEL_REGISTRY,\n    ChatMessage,\n    ChatMessageStreamDelta,\n    ChatMessageToolCall,\n    MessageRole,\n    Model,\n    agglomerate_stream_deltas,\n    parse_json_if_needed,\n)\nfrom .monitoring import (\n    YELLOW_HEX,\n    AgentLogger,\n    LogLevel,\n    Monitor,\n    TokenUsage,\n)\nfrom .remote_executors import BlaxelExecutor, DockerExecutor, E2BExecutor, ModalExecutor, WasmExecutor\nfrom .tools import BaseTool, Tool, validate_tool_arguments\nfrom .utils import (\n    AgentError,\n    AgentExecutionError,\n    AgentGenerationError,\n    AgentMaxStepsError,\n    AgentParsingError,\n    AgentToolCallError,\n    AgentToolExecutionError,\n    create_agent_gradio_app_template,\n    extract_code_from_text,\n    is_valid_name,\n    make_init_file,\n    parse_code_blobs,\n    truncate_content,\n)\n\n\nlogger = getLogger(__name__)\n\n\ndef populate_template(template: str, variables: dict[str, Any]) -> str:\n    compiled_template = Template(template, undefined=StrictUndefined)\n    try:\n        return compiled_template.render(**variables)\n    except Exception as e:\n        raise Exception(f\"Error during jinja template rendering: {type(e).__name__}: {e}\")\n\n\n@dataclass\nclass ActionOutput:\n    output: Any\n    is_final_answer: bool\n\n\n@dataclass\nclass ToolOutput:\n    id: str\n    output: Any\n    is_final_answer: bool\n    observation: str\n    tool_call: ToolCall\n\n\nclass PlanningPromptTemplate(TypedDict):\n    \"\"\"\n    Prompt templates for the planning step.\n\n    Args:\n        initial_plan (`str`): Initial plan prompt.\n        update_plan_pre_messages (`str`): Update plan pre-messages prompt.\n        update_plan_post_messages (`str`): Update plan post-messages prompt.\n    \"\"\"\n\n    initial_plan: str\n    update_plan_pre_messages: str\n    update_plan_post_messages: str\n\n\nclass ManagedAgentPromptTemplate(TypedDict):\n    \"\"\"\n    Prompt templates for the managed 
agent.\n\n    Args:\n        task (`str`): Task prompt.\n        report (`str`): Report prompt.\n    \"\"\"\n\n    task: str\n    report: str\n\n\nclass FinalAnswerPromptTemplate(TypedDict):\n    \"\"\"\n    Prompt templates for the final answer.\n\n    Args:\n        pre_messages (`str`): Pre-messages prompt.\n        post_messages (`str`): Post-messages prompt.\n    \"\"\"\n\n    pre_messages: str\n    post_messages: str\n\n\nclass PromptTemplates(TypedDict):\n    \"\"\"\n    Prompt templates for the agent.\n\n    Args:\n        system_prompt (`str`): System prompt.\n        planning ([`~agents.PlanningPromptTemplate`]): Planning prompt templates.\n        managed_agent ([`~agents.ManagedAgentPromptTemplate`]): Managed agent prompt templates.\n        final_answer ([`~agents.FinalAnswerPromptTemplate`]): Final answer prompt templates.\n    \"\"\"\n\n    system_prompt: str\n    planning: PlanningPromptTemplate\n    managed_agent: ManagedAgentPromptTemplate\n    final_answer: FinalAnswerPromptTemplate\n\n\nEMPTY_PROMPT_TEMPLATES = PromptTemplates(\n    system_prompt=\"\",\n    planning=PlanningPromptTemplate(\n        initial_plan=\"\",\n        update_plan_pre_messages=\"\",\n        update_plan_post_messages=\"\",\n    ),\n    managed_agent=ManagedAgentPromptTemplate(task=\"\", report=\"\"),\n    final_answer=FinalAnswerPromptTemplate(pre_messages=\"\", post_messages=\"\"),\n)\n\n\n@dataclass\nclass RunResult:\n    \"\"\"Holds extended information about an agent run.\n\n    Attributes:\n        output (Any | None): The final output of the agent run, if available.\n        state (Literal[\"success\", \"max_steps_error\"]): The final state of the agent after the run.\n        steps (list[dict]): The agent's memory, as a list of steps.\n        token_usage (TokenUsage | None): Count of tokens used during the run.\n        timing (Timing): Timing details of the agent run: start time, end time, duration.\n        messages (list[dict]): The agent's memory, as a list of 
messages.\n            <Deprecated version=\"1.22.0\">\n            Parameter 'messages' is deprecated and will be removed in version 1.25. Please use 'steps' instead.\n            </Deprecated>\n    \"\"\"\n\n    output: Any | None\n    state: Literal[\"success\", \"max_steps_error\"]\n    steps: list[dict]\n    token_usage: TokenUsage | None\n    timing: Timing\n\n    def __init__(self, output=None, state=None, steps=None, token_usage=None, timing=None, messages=None):\n        # Handle deprecated 'messages' parameter\n        if messages is not None:\n            if steps is not None:\n                raise ValueError(\"Cannot specify both 'messages' and 'steps' parameters. Use 'steps' instead.\")\n            warnings.warn(\n                \"Parameter 'messages' is deprecated and will be removed in version 1.25. Please use 'steps' instead.\",\n                FutureWarning,\n                stacklevel=2,\n            )\n            steps = messages\n\n        # Initialize with dataclass fields\n        self.output = output\n        self.state = state\n        self.steps = steps\n        self.token_usage = token_usage\n        self.timing = timing\n\n    @property\n    def messages(self):\n        \"\"\"Backward compatibility property that returns steps.\"\"\"\n        warnings.warn(\n            \"Parameter 'messages' is deprecated and will be removed in version 1.25. 
Please use 'steps' instead.\",\n            FutureWarning,\n            stacklevel=2,\n        )\n        return self.steps\n\n    def dict(self):\n        return {\n            \"output\": self.output,\n            \"state\": self.state,\n            \"steps\": self.steps,\n            \"token_usage\": self.token_usage.dict() if self.token_usage is not None else None,\n            \"timing\": self.timing.dict(),\n        }\n\n\nStreamEvent: TypeAlias = Union[\n    ChatMessageStreamDelta,\n    ChatMessageToolCall,\n    ActionOutput,\n    ToolCall,\n    ToolOutput,\n    PlanningStep,\n    ActionStep,\n    FinalAnswerStep,\n]\n\n\nclass MultiStepAgent(ABC):\n    \"\"\"\n    Agent class that solves the given task step by step, using the ReAct framework:\n    While the objective is not reached, the agent will perform a cycle of action (given by the LLM) and observation (obtained from the environment).\n\n    Args:\n        tools (`list[Tool]`): [`Tool`]s that the agent can use.\n        model (`Callable[[list[dict[str, str]]], ChatMessage]`): Model that will generate the agent's actions.\n        prompt_templates ([`~agents.PromptTemplates`], *optional*): Prompt templates.\n        instructions (`str`, *optional*): Custom instructions for the agent, will be inserted in the system prompt.\n        max_steps (`int`, default `20`): Maximum number of steps the agent can take to solve the task.\n        add_base_tools (`bool`, default `False`): Whether to add the base tools to the agent's tools.\n        verbosity_level (`LogLevel`, default `LogLevel.INFO`): Level of verbosity of the agent's logs.\n        managed_agents (`list`, *optional*): Managed agents that the agent can call.\n        step_callbacks (`list[Callable]` | `dict[Type[MemoryStep], Callable | list[Callable]]`, *optional*): Callbacks that will be called at each step.\n        planning_interval (`int`, *optional*): Interval at which the agent will run a planning step.\n        name (`str`, *optional*): 
Necessary for a managed agent only - the name by which this agent can be called.\n        description (`str`, *optional*): Necessary for a managed agent only - the description of this agent.\n        provide_run_summary (`bool`, *optional*): Whether to provide a run summary when called as a managed agent.\n        final_answer_checks (`list[Callable]`, *optional*): List of validation functions to run before accepting a final answer.\n            Each function should:\n            - Take the final answer, the agent's memory, and the agent itself as arguments.\n            - Return a boolean indicating whether the final answer is valid.\n        return_full_result (`bool`, default `False`): Whether to return the full [`RunResult`] object or just the final answer output from the agent run.\n    \"\"\"\n\n    def __init__(\n        self,\n        tools: list[Tool],\n        model: Model,\n        prompt_templates: PromptTemplates | None = None,\n        instructions: str | None = None,\n        max_steps: int = 20,\n        add_base_tools: bool = False,\n        verbosity_level: LogLevel = LogLevel.INFO,\n        managed_agents: list | None = None,\n        step_callbacks: list[Callable] | dict[Type[MemoryStep], Callable | list[Callable]] | None = None,\n        planning_interval: int | None = None,\n        name: str | None = None,\n        description: str | None = None,\n        provide_run_summary: bool = False,\n        final_answer_checks: list[Callable] | None = None,\n        return_full_result: bool = False,\n        logger: AgentLogger | None = None,\n    ):\n        self.agent_name = self.__class__.__name__\n        self.model = model\n        self.prompt_templates = prompt_templates or EMPTY_PROMPT_TEMPLATES\n        if prompt_templates is not None:\n            missing_keys = set(EMPTY_PROMPT_TEMPLATES.keys()) - set(prompt_templates.keys())\n            assert not missing_keys, (\n                f\"Some prompt templates are missing from your custom 
`prompt_templates`: {missing_keys}\"\n            )\n            for key, value in EMPTY_PROMPT_TEMPLATES.items():\n                if isinstance(value, dict):\n                    for subkey in value.keys():\n                        assert key in prompt_templates.keys() and (subkey in prompt_templates[key].keys()), (\n                            f\"Some prompt templates are missing from your custom `prompt_templates`: {subkey} under {key}\"\n                        )\n\n        self.max_steps = max_steps\n        self.step_number = 0\n        self.planning_interval = planning_interval\n        self.state: dict[str, Any] = {}\n        self.name = self._validate_name(name)\n        self.description = description\n        self.provide_run_summary = provide_run_summary\n        self.final_answer_checks = final_answer_checks if final_answer_checks is not None else []\n        self.return_full_result = return_full_result\n        self.instructions = instructions\n        self._setup_managed_agents(managed_agents)\n        self._setup_tools(tools, add_base_tools)\n        self._validate_tools_and_managed_agents(tools, managed_agents)\n\n        self.task: str | None = None\n        self.memory = AgentMemory(self.system_prompt)\n\n        if logger is None:\n            self.logger = AgentLogger(level=verbosity_level)\n        else:\n            self.logger = logger\n\n        self.monitor = Monitor(self.model, self.logger)\n        self._setup_step_callbacks(step_callbacks)\n        self.stream_outputs = False\n\n    @property\n    def system_prompt(self) -> str:\n        return self.initialize_system_prompt()\n\n    @system_prompt.setter\n    def system_prompt(self, value: str):\n        raise AttributeError(\n            \"\"\"The 'system_prompt' property is read-only. 
Use 'self.prompt_templates[\"system_prompt\"]' instead.\"\"\"\n        )\n\n    def _validate_name(self, name: str | None) -> str | None:\n        if name is not None and not is_valid_name(name):\n            raise ValueError(f\"Agent name '{name}' must be a valid Python identifier and not a reserved keyword.\")\n        return name\n\n    def _setup_managed_agents(self, managed_agents: list | None = None) -> None:\n        \"\"\"Setup managed agents with proper logging.\"\"\"\n        self.managed_agents = {}\n        if managed_agents:\n            assert all(agent.name and agent.description for agent in managed_agents), (\n                \"All managed agents need both a name and a description!\"\n            )\n            self.managed_agents = {agent.name: agent for agent in managed_agents}\n            # Ensure managed agents can be called as tools by the model: set their inputs and output_type\n            for agent in self.managed_agents.values():\n                agent.inputs = {\n                    \"task\": {\"type\": \"string\", \"description\": \"Long detailed description of the task.\"},\n                    \"additional_args\": {\n                        \"type\": \"object\",\n                        \"description\": \"Dictionary of extra inputs to pass to the managed agent, e.g. 
images, dataframes, or any other contextual data it may need.\",\n                        \"nullable\": True,\n                    },\n                }\n                agent.output_type = \"string\"\n\n    def _setup_tools(self, tools, add_base_tools):\n        assert all(isinstance(tool, BaseTool) for tool in tools), (\n            \"All elements must be instance of BaseTool (or a subclass)\"\n        )\n        self.tools = {tool.name: tool for tool in tools}\n        if add_base_tools:\n            self.tools.update(\n                {\n                    name: cls()\n                    for name, cls in TOOL_MAPPING.items()\n                    if name != \"python_interpreter\" or self.__class__.__name__ == \"ToolCallingAgent\"\n                }\n            )\n        self.tools.setdefault(\"final_answer\", FinalAnswerTool())\n\n    def _validate_tools_and_managed_agents(self, tools, managed_agents):\n        tool_and_managed_agent_names = [tool.name for tool in tools]\n        if managed_agents is not None:\n            tool_and_managed_agent_names += [agent.name for agent in managed_agents]\n        if self.name:\n            tool_and_managed_agent_names.append(self.name)\n        if len(tool_and_managed_agent_names) != len(set(tool_and_managed_agent_names)):\n            raise ValueError(\n                \"Each tool or managed_agent should have a unique name! 
You passed these duplicate names: \"\n                f\"{[name for name in tool_and_managed_agent_names if tool_and_managed_agent_names.count(name) > 1]}\"\n            )\n\n    def _setup_step_callbacks(self, step_callbacks):\n        # Initialize step callbacks registry\n        self.step_callbacks = CallbackRegistry()\n        if step_callbacks:\n            # Register callbacks list only for ActionStep for backward compatibility\n            if isinstance(step_callbacks, list):\n                for callback in step_callbacks:\n                    self.step_callbacks.register(ActionStep, callback)\n            # Register callbacks dict for specific step classes\n            elif isinstance(step_callbacks, dict):\n                for step_cls, callbacks in step_callbacks.items():\n                    if not isinstance(callbacks, list):\n                        callbacks = [callbacks]\n                    for callback in callbacks:\n                        self.step_callbacks.register(step_cls, callback)\n            else:\n                raise ValueError(\"step_callbacks must be a list or a dict\")\n        # Register monitor update_metrics only for ActionStep for backward compatibility\n        self.step_callbacks.register(ActionStep, self.monitor.update_metrics)\n\n    def run(\n        self,\n        task: str,\n        stream: bool = False,\n        reset: bool = True,\n        images: list[\"PIL.Image.Image\"] | None = None,\n        additional_args: dict | None = None,\n        max_steps: int | None = None,\n        return_full_result: bool | None = None,\n    ) -> Any | RunResult:\n        \"\"\"\n        Run the agent for the given task.\n\n        Args:\n            task (`str`): Task to perform.\n            stream (`bool`): Whether to run in streaming mode.\n                If `True`, returns a generator that yields each step as it is executed. 
You must iterate over this generator to process the individual steps (e.g., using a for loop or `next()`).\n                If `False`, executes all steps internally and returns only the final answer after completion.\n            reset (`bool`): Whether to reset the conversation or keep it going from the previous run.\n            images (`list[PIL.Image.Image]`, *optional*): Image object(s).\n            additional_args (`dict`, *optional*): Any other variables that you want to pass to the agent run, for instance images or dataframes. Give them clear names!\n            max_steps (`int`, *optional*): Maximum number of steps the agent can take to solve the task. If not provided, the agent's default value is used.\n            return_full_result (`bool`, *optional*): Whether to return the full [`RunResult`] object or just the final answer output.\n                If `None` (default), the agent's `self.return_full_result` setting is used.\n\n        Example:\n        ```py\n        from smolagents import CodeAgent\n        agent = CodeAgent(tools=[])\n        agent.run(\"What is the result of 2 power 3.7384?\")\n        ```\n        \"\"\"\n        max_steps = max_steps or self.max_steps\n        self.task = task\n        self.interrupt_switch = False\n        if additional_args:\n            self.state.update(additional_args)\n            self.task += f\"\"\"\nYou have been provided with these additional arguments, that you can access directly using the keys as variables:\n{str(additional_args)}.\"\"\"\n\n        self.memory.system_prompt = SystemPromptStep(system_prompt=self.system_prompt)\n        if reset:\n            self.memory.reset()\n            self.monitor.reset()\n\n        self.logger.log_task(\n            content=self.task.strip(),\n            subtitle=f\"{type(self.model).__name__} - {(self.model.model_id if hasattr(self.model, 'model_id') else '')}\",\n            level=LogLevel.INFO,\n            title=self.name if hasattr(self, \"name\") else 
None,\n        )\n        self.memory.steps.append(TaskStep(task=self.task, task_images=images))\n\n        if getattr(self, \"python_executor\", None):\n            self.python_executor.send_variables(variables=self.state)\n            self.python_executor.send_tools({**self.tools, **self.managed_agents})\n\n        if stream:\n            # The steps are returned as they are executed through a generator to iterate on.\n            return self._run_stream(task=self.task, max_steps=max_steps, images=images)\n\n        run_start_time = time.time()\n        steps = list(self._run_stream(task=self.task, max_steps=max_steps, images=images))\n\n        # Outputs are returned only at the end. We only look at the last step.\n        assert isinstance(steps[-1], FinalAnswerStep)\n        output = steps[-1].output\n\n        return_full_result = return_full_result if return_full_result is not None else self.return_full_result\n        if return_full_result:\n            total_input_tokens = 0\n            total_output_tokens = 0\n            correct_token_usage = True\n            for step in self.memory.steps:\n                if isinstance(step, (ActionStep, PlanningStep)):\n                    if step.token_usage is None:\n                        correct_token_usage = False\n                        break\n                    else:\n                        total_input_tokens += step.token_usage.input_tokens\n                        total_output_tokens += step.token_usage.output_tokens\n            if correct_token_usage:\n                token_usage = TokenUsage(input_tokens=total_input_tokens, output_tokens=total_output_tokens)\n            else:\n                token_usage = None\n\n            if self.memory.steps and isinstance(getattr(self.memory.steps[-1], \"error\", None), AgentMaxStepsError):\n                state = \"max_steps_error\"\n            else:\n                state = \"success\"\n\n            step_dicts = self.memory.get_full_steps()\n\n            
return RunResult(\n                output=output,\n                token_usage=token_usage,\n                steps=step_dicts,\n                timing=Timing(start_time=run_start_time, end_time=time.time()),\n                state=state,\n            )\n\n        return output\n\n    def _run_stream(\n        self, task: str, max_steps: int, images: list[\"PIL.Image.Image\"] | None = None\n    ) -> Generator[ActionStep | PlanningStep | FinalAnswerStep | ChatMessageStreamDelta]:\n        self.step_number = 1\n        returned_final_answer = False\n        while not returned_final_answer and self.step_number <= max_steps:\n            if self.interrupt_switch:\n                raise AgentError(\"Agent interrupted.\", self.logger)\n\n            # Run a planning step if scheduled\n            if self.planning_interval is not None and (\n                self.step_number == 1 or (self.step_number - 1) % self.planning_interval == 0\n            ):\n                planning_start_time = time.time()\n                planning_step = None\n                for element in self._generate_planning_step(\n                    task, is_first_step=len(self.memory.steps) == 1, step=self.step_number\n                ):  # Don't use the attribute step_number here, because there can be steps from previous runs\n                    yield element\n                    planning_step = element\n                assert isinstance(planning_step, PlanningStep)  # Last yielded element should be a PlanningStep\n                planning_end_time = time.time()\n                planning_step.timing = Timing(\n                    start_time=planning_start_time,\n                    end_time=planning_end_time,\n                )\n                self._finalize_step(planning_step)\n                self.memory.steps.append(planning_step)\n\n            # Start action step!\n            action_step_start_time = time.time()\n            action_step = ActionStep(\n                
step_number=self.step_number,\n                timing=Timing(start_time=action_step_start_time),\n                observations_images=images,\n            )\n            self.logger.log_rule(f\"Step {self.step_number}\", level=LogLevel.INFO)\n            try:\n                for output in self._step_stream(action_step):\n                    # Yield all\n                    yield output\n\n                    if isinstance(output, ActionOutput) and output.is_final_answer:\n                        final_answer = output.output\n                        self.logger.log(\n                            Text(f\"Final answer: {final_answer}\", style=f\"bold {YELLOW_HEX}\"),\n                            level=LogLevel.INFO,\n                        )\n\n                        if self.final_answer_checks:\n                            self._validate_final_answer(final_answer)\n                        returned_final_answer = True\n                        action_step.is_final_answer = True\n\n            except AgentGenerationError as e:\n                # Agent generation errors are not caused by a Model error but an implementation error: so we should raise them and exit.\n                raise e\n            except AgentError as e:\n                # Other AgentError types are caused by the Model, so we should log them and iterate.\n                action_step.error = e\n            finally:\n                self._finalize_step(action_step)\n                self.memory.steps.append(action_step)\n                yield action_step\n                self.step_number += 1\n\n        if not returned_final_answer and self.step_number == max_steps + 1:\n            final_answer = self._handle_max_steps_reached(task)\n            yield action_step\n        final_answer_step = FinalAnswerStep(handle_agent_output_types(final_answer))\n        self._finalize_step(final_answer_step)\n        yield final_answer_step\n\n    def _validate_final_answer(self, final_answer: Any):\n        for 
check_function in self.final_answer_checks:\n            try:\n                assert check_function(final_answer, self.memory, agent=self)\n            except Exception as e:\n                raise AgentError(f\"Check {check_function.__name__} failed with error: {e}\", self.logger)\n\n    def _finalize_step(self, memory_step: ActionStep | PlanningStep | FinalAnswerStep):\n        if not isinstance(memory_step, FinalAnswerStep):\n            memory_step.timing.end_time = time.time()\n        self.step_callbacks.callback(memory_step, agent=self)\n\n    def _handle_max_steps_reached(self, task: str) -> Any:\n        action_step_start_time = time.time()\n        final_answer = self.provide_final_answer(task)\n        final_memory_step = ActionStep(\n            step_number=self.step_number,\n            error=AgentMaxStepsError(\"Reached max steps.\", self.logger),\n            timing=Timing(start_time=action_step_start_time, end_time=time.time()),\n            token_usage=final_answer.token_usage,\n        )\n        final_memory_step.action_output = final_answer.content\n        self._finalize_step(final_memory_step)\n        self.memory.steps.append(final_memory_step)\n        return final_answer.content\n\n    def _generate_planning_step(\n        self, task, is_first_step: bool, step: int\n    ) -> Generator[ChatMessageStreamDelta | PlanningStep]:\n        start_time = time.time()\n        if is_first_step:\n            input_messages = [\n                ChatMessage(\n                    role=MessageRole.USER,\n                    content=[\n                        {\n                            \"type\": \"text\",\n                            \"text\": populate_template(\n                                self.prompt_templates[\"planning\"][\"initial_plan\"],\n                                variables={\"task\": task, \"tools\": self.tools, \"managed_agents\": self.managed_agents},\n                            ),\n                        }\n                    
],\n                )\n            ]\n            if self.stream_outputs and hasattr(self.model, \"generate_stream\"):\n                plan_message_content = \"\"\n                output_stream = self.model.generate_stream(input_messages, stop_sequences=[\"<end_plan>\"])  # type: ignore\n                input_tokens, output_tokens = 0, 0\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in output_stream:\n                        if event.content is not None:\n                            plan_message_content += event.content\n                            live.update(Markdown(plan_message_content))\n                            if event.token_usage:\n                                input_tokens = event.token_usage.input_tokens\n                                output_tokens += event.token_usage.output_tokens\n                        yield event\n            else:\n                plan_message = self.model.generate(input_messages, stop_sequences=[\"<end_plan>\"])\n                plan_message_content = plan_message.content\n                input_tokens, output_tokens = 0, 0\n                if plan_message.token_usage:\n                    input_tokens = plan_message.token_usage.input_tokens\n                    output_tokens = plan_message.token_usage.output_tokens\n            plan = textwrap.dedent(\n                f\"\"\"Here are the facts I know and the plan of action that I will follow to solve the task:\\n```\\n{plan_message_content}\\n```\"\"\"\n            )\n        else:\n            # Summary mode removes the system prompt and previous planning messages output by the model.\n            # Removing previous planning messages avoids influencing too much the new plan.\n            memory_messages = self.write_memory_to_messages(summary_mode=True)\n            plan_update_pre = ChatMessage(\n                role=MessageRole.SYSTEM,\n                content=[\n                    
{\n                        \"type\": \"text\",\n                        \"text\": populate_template(\n                            self.prompt_templates[\"planning\"][\"update_plan_pre_messages\"], variables={\"task\": task}\n                        ),\n                    }\n                ],\n            )\n            plan_update_post = ChatMessage(\n                role=MessageRole.USER,\n                content=[\n                    {\n                        \"type\": \"text\",\n                        \"text\": populate_template(\n                            self.prompt_templates[\"planning\"][\"update_plan_post_messages\"],\n                            variables={\n                                \"task\": task,\n                                \"tools\": self.tools,\n                                \"managed_agents\": self.managed_agents,\n                                \"remaining_steps\": (self.max_steps - step),\n                            },\n                        ),\n                    }\n                ],\n            )\n            input_messages = [plan_update_pre] + memory_messages + [plan_update_post]\n            if self.stream_outputs and hasattr(self.model, \"generate_stream\"):\n                plan_message_content = \"\"\n                input_tokens, output_tokens = 0, 0\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in self.model.generate_stream(\n                        input_messages,\n                        stop_sequences=[\"<end_plan>\"],\n                    ):  # type: ignore\n                        if event.content is not None:\n                            plan_message_content += event.content\n                            live.update(Markdown(plan_message_content))\n                            if event.token_usage:\n                                input_tokens = event.token_usage.input_tokens\n                                output_tokens 
+= event.token_usage.output_tokens\n                        yield event\n            else:\n                plan_message = self.model.generate(input_messages, stop_sequences=[\"<end_plan>\"])\n                plan_message_content = plan_message.content\n                input_tokens, output_tokens = 0, 0\n                if plan_message.token_usage:\n                    input_tokens = plan_message.token_usage.input_tokens\n                    output_tokens = plan_message.token_usage.output_tokens\n            plan = textwrap.dedent(\n                f\"\"\"I still need to solve the task I was given:\\n```\\n{self.task}\\n```\\n\\nHere are the facts I know and my new/updated plan of action to solve the task:\\n```\\n{plan_message_content}\\n```\"\"\"\n            )\n        log_headline = \"Initial plan\" if is_first_step else \"Updated plan\"\n        self.logger.log(Rule(f\"[bold]{log_headline}\", style=\"orange\"), Text(plan), level=LogLevel.INFO)\n        yield PlanningStep(\n            model_input_messages=input_messages,\n            plan=plan,\n            model_output_message=ChatMessage(role=MessageRole.ASSISTANT, content=plan_message_content),\n            token_usage=TokenUsage(input_tokens=input_tokens, output_tokens=output_tokens),\n            timing=Timing(start_time=start_time, end_time=time.time()),\n        )\n\n    @abstractmethod\n    def initialize_system_prompt(self) -> str:\n        \"\"\"To be implemented in child classes\"\"\"\n        ...\n\n    def interrupt(self):\n        \"\"\"Interrupts the agent execution.\"\"\"\n        self.interrupt_switch = True\n\n    def write_memory_to_messages(\n        self,\n        summary_mode: bool = False,\n    ) -> list[ChatMessage]:\n        \"\"\"\n        Reads past llm_outputs, actions, and observations or errors from the memory into a series of messages\n        that can be used as input to the LLM. 
Adds a number of keywords (such as PLAN, error, etc) to help\n        the LLM.\n        \"\"\"\n        messages = self.memory.system_prompt.to_messages(summary_mode=summary_mode)\n        for memory_step in self.memory.steps:\n            messages.extend(memory_step.to_messages(summary_mode=summary_mode))\n        return messages\n\n    def _step_stream(\n        self, memory_step: ActionStep\n    ) -> Generator[ChatMessageStreamDelta | ToolCall | ToolOutput | ActionOutput]:\n        \"\"\"\n        Perform one step in the ReAct framework: the agent thinks, acts, and observes the result.\n        Yields ChatMessageStreamDelta during the run if streaming is enabled.\n        At the end, yields either None if the step is not final, or the final answer.\n        \"\"\"\n        raise NotImplementedError(\"This method should be implemented in child classes\")\n\n    def step(self, memory_step: ActionStep) -> Any:\n        \"\"\"\n        Perform one step in the ReAct framework: the agent thinks, acts, and observes the result.\n        Returns either None if the step is not final, or the final answer.\n        \"\"\"\n        return list(self._step_stream(memory_step))[-1]\n\n    def extract_action(self, model_output: str, split_token: str) -> tuple[str, str]:\n        \"\"\"\n        Parse action from the LLM output\n\n        Args:\n            model_output (`str`): Output of the LLM\n            split_token (`str`): Separator for the action. Should match the example in the system prompt.\n        \"\"\"\n        try:\n            split = model_output.split(split_token)\n            rationale, action = (\n                split[-2],\n                split[-1],\n            )  # NOTE: using indexes starting from the end solves for when you have more than one split_token in the output\n        except Exception:\n            raise AgentParsingError(\n                f\"No '{split_token}' token provided in your output.\\nYour output:\\n{model_output}\\n. 
Be sure to include an action, prefaced with '{split_token}'!\",\n                self.logger,\n            )\n        return rationale.strip(), action.strip()\n\n    def provide_final_answer(self, task: str) -> ChatMessage:\n        \"\"\"\n        Provide the final answer to the task, based on the logs of the agent's interactions.\n\n        Args:\n            task (`str`): Task to perform.\n\n        Returns:\n            `ChatMessage`: Message containing the final answer to the task.\n        \"\"\"\n        messages = [\n            ChatMessage(\n                role=MessageRole.SYSTEM,\n                content=[\n                    {\n                        \"type\": \"text\",\n                        \"text\": self.prompt_templates[\"final_answer\"][\"pre_messages\"],\n                    }\n                ],\n            )\n        ]\n        messages += self.write_memory_to_messages()[1:]\n        messages.append(\n            ChatMessage(\n                role=MessageRole.USER,\n                content=[\n                    {\n                        \"type\": \"text\",\n                        \"text\": populate_template(\n                            self.prompt_templates[\"final_answer\"][\"post_messages\"], variables={\"task\": task}\n                        ),\n                    }\n                ],\n            )\n        )\n        try:\n            chat_message: ChatMessage = self.model.generate(messages)\n            return chat_message\n        except Exception as e:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=[{\"type\": \"text\", \"text\": f\"Error in generating final LLM output: {e}\"}],\n            )\n\n    def visualize(self):\n        \"\"\"Creates a rich tree visualization of the agent's structure.\"\"\"\n        self.logger.visualize_agent_tree(self)\n\n    def replay(self, detailed: bool = False):\n        \"\"\"Prints a pretty 
replay of the agent's steps.\n\n        Args:\n            detailed (bool, optional): If True, also displays the memory at each step. Defaults to False.\n                Careful: will increase log length exponentially. Use only for debugging.\n        \"\"\"\n        self.memory.replay(self.logger, detailed=detailed)\n\n    def __call__(self, task: str, **kwargs):\n        \"\"\"Adds additional prompting for the managed agent, runs it, and wraps the output.\n        This method is called only by a managed agent.\n        \"\"\"\n        full_task = populate_template(\n            self.prompt_templates[\"managed_agent\"][\"task\"],\n            variables=dict(name=self.name, task=task),\n        )\n        result = self.run(full_task, **kwargs)\n        if isinstance(result, RunResult):\n            report = result.output\n        else:\n            report = result\n        answer = populate_template(\n            self.prompt_templates[\"managed_agent\"][\"report\"], variables=dict(name=self.name, final_answer=report)\n        )\n        if self.provide_run_summary:\n            answer += \"\\n\\nFor more detail, find below a summary of this agent's work:\\n<summary_of_work>\\n\"\n            for message in self.write_memory_to_messages(summary_mode=True):\n                content = message.content\n                answer += \"\\n\" + truncate_content(str(content)) + \"\\n---\"\n            answer += \"\\n</summary_of_work>\"\n        return answer\n\n    def save(self, output_dir: str | Path, relative_path: str | None = None):\n        \"\"\"\n        Saves the relevant code files for your agent. 
This will copy the code of your agent in `output_dir` as well as autogenerate:\n\n        - a `tools` folder containing the logic for each of the tools under `tools/{tool_name}.py`.\n        - a `managed_agents` folder containing the logic for each of the managed agents.\n        - an `agent.json` file containing a dictionary representing your agent.\n        - a `prompts.yaml` file containing the prompt templates used by your agent.\n        - an `app.py` file providing a UI for your agent when it is exported to a Space with `agent.push_to_hub()`\n        - a `requirements.txt` containing the names of the modules used by your tool (as detected when inspecting its\n          code)\n\n        Args:\n            output_dir (`str` or `Path`): The folder in which you want to save your agent.\n        \"\"\"\n        make_init_file(output_dir)\n\n        # Recursively save managed agents\n        if self.managed_agents:\n            make_init_file(os.path.join(output_dir, \"managed_agents\"))\n            for agent_name, agent in self.managed_agents.items():\n                agent_suffix = f\"managed_agents.{agent_name}\"\n                if relative_path:\n                    agent_suffix = relative_path + \".\" + agent_suffix\n                agent.save(os.path.join(output_dir, \"managed_agents\", agent_name), relative_path=agent_suffix)\n\n        class_name = self.__class__.__name__\n\n        # Save tools to different .py files\n        for tool in self.tools.values():\n            make_init_file(os.path.join(output_dir, \"tools\"))\n            tool.save(os.path.join(output_dir, \"tools\"), tool_file_name=tool.name, make_gradio_app=False)\n\n        # Save prompts to yaml\n        yaml_prompts = yaml.safe_dump(\n            self.prompt_templates,\n            default_style=\"|\",  # This forces block literals for all strings\n            default_flow_style=False,\n            width=float(\"inf\"),\n            sort_keys=False,\n            allow_unicode=True,\n     
       indent=2,\n        )\n\n        with open(os.path.join(output_dir, \"prompts.yaml\"), \"w\", encoding=\"utf-8\") as f:\n            f.write(yaml_prompts)\n\n        # Save agent dictionary to json\n        agent_dict = self.to_dict()\n        agent_dict[\"tools\"] = [tool.name for tool in self.tools.values()]\n        agent_dict[\"managed_agents\"] = {agent.name: agent.__class__.__name__ for agent in self.managed_agents.values()}\n        with open(os.path.join(output_dir, \"agent.json\"), \"w\", encoding=\"utf-8\") as f:\n            json.dump(agent_dict, f, indent=4)\n\n        # Save requirements\n        with open(os.path.join(output_dir, \"requirements.txt\"), \"w\", encoding=\"utf-8\") as f:\n            f.writelines(f\"{r}\\n\" for r in agent_dict[\"requirements\"])\n\n        # Make agent.py file with Gradio UI\n        agent_name = f\"agent_{self.name}\" if getattr(self, \"name\", None) else \"agent\"\n        managed_agent_relative_path = relative_path + \".\" if relative_path is not None else \"\"\n        app_template = create_agent_gradio_app_template()\n\n        # Render the app.py file from Jinja2 template\n        app_text = app_template.render(\n            {\n                \"agent_name\": agent_name,\n                \"class_name\": class_name,\n                \"agent_dict\": agent_dict,\n                \"tools\": self.tools,\n                \"managed_agents\": self.managed_agents,\n                \"managed_agent_relative_path\": managed_agent_relative_path,\n            }\n        )\n\n        with open(os.path.join(output_dir, \"app.py\"), \"w\", encoding=\"utf-8\") as f:\n            f.write(app_text + \"\\n\")  # Append newline at the end\n\n    def to_dict(self) -> dict[str, Any]:\n        \"\"\"Convert the agent to a dictionary representation.\n\n        Returns:\n            `dict`: Dictionary representation of the agent.\n        \"\"\"\n        # TODO: handle serializing step_callbacks and final_answer_checks\n        for 
attr in [\"final_answer_checks\", \"step_callbacks\"]:\n            if getattr(self, attr, None):\n                self.logger.log(f\"This agent has {attr}: they will be ignored by this method.\", LogLevel.INFO)\n\n        tool_dicts = [tool.to_dict() for tool in self.tools.values()]\n        tool_requirements = {req for tool in self.tools.values() for req in tool.to_dict()[\"requirements\"]}\n        managed_agents_requirements = {\n            req for managed_agent in self.managed_agents.values() for req in managed_agent.to_dict()[\"requirements\"]\n        }\n        requirements = tool_requirements | managed_agents_requirements\n        if hasattr(self, \"authorized_imports\"):\n            requirements.update(\n                {package.split(\".\")[0] for package in self.authorized_imports if package not in BASE_BUILTIN_MODULES}\n            )\n\n        agent_dict = {\n            \"class\": self.__class__.__name__,\n            \"tools\": tool_dicts,\n            \"model\": {\n                \"class\": self.model.__class__.__name__,\n                \"data\": self.model.to_dict(),\n            },\n            \"managed_agents\": [managed_agent.to_dict() for managed_agent in self.managed_agents.values()],\n            \"prompt_templates\": self.prompt_templates,\n            \"max_steps\": self.max_steps,\n            \"verbosity_level\": int(self.logger.level),\n            \"planning_interval\": self.planning_interval,\n            \"name\": self.name,\n            \"description\": self.description,\n            \"requirements\": sorted(requirements),\n        }\n        return agent_dict\n\n    @classmethod\n    def from_dict(cls, agent_dict: dict[str, Any], **kwargs) -> \"MultiStepAgent\":\n        \"\"\"Create agent from a dictionary representation.\n\n        Args:\n            agent_dict (`dict[str, Any]`): Dictionary representation of the agent.\n            **kwargs: Additional keyword arguments that will override agent_dict values.\n\n        
Returns:\n            `MultiStepAgent`: Instance of the agent class.\n        \"\"\"\n        # Load model\n        model_info = agent_dict[\"model\"]\n        model_class = MODEL_REGISTRY.get(model_info[\"class\"])\n        if model_class is None:\n            raise ValueError(\n                f\"Unknown model class '{model_info['class']}'. \"\n                f\"Supported models: {', '.join(sorted(MODEL_REGISTRY.keys()))}\"\n            )\n        model = model_class.from_dict(model_info[\"data\"])\n        # Load tools\n        tools = []\n        for tool_info in agent_dict[\"tools\"]:\n            tools.append(Tool.from_code(tool_info[\"code\"]))\n        # Load managed agents\n        managed_agents = []\n        for managed_agent_dict in agent_dict[\"managed_agents\"]:\n            agent_class = AGENT_REGISTRY.get(managed_agent_dict[\"class\"])\n            if agent_class is None:\n                raise ValueError(\n                    f\"Unknown agent class '{managed_agent_dict['class']}'. 
\"\n                    f\"Supported agents: {', '.join(sorted(AGENT_REGISTRY.keys()))}\"\n                )\n            managed_agent = agent_class.from_dict(managed_agent_dict, **kwargs)\n            managed_agents.append(managed_agent)\n        # Extract base agent parameters\n        agent_args = {\n            \"model\": model,\n            \"tools\": tools,\n            \"managed_agents\": managed_agents,\n            \"prompt_templates\": agent_dict.get(\"prompt_templates\"),\n            \"max_steps\": agent_dict.get(\"max_steps\"),\n            \"verbosity_level\": agent_dict.get(\"verbosity_level\"),\n            \"planning_interval\": agent_dict.get(\"planning_interval\"),\n            \"name\": agent_dict.get(\"name\"),\n            \"description\": agent_dict.get(\"description\"),\n        }\n        # Filter out None values to use defaults from __init__\n        agent_args = {k: v for k, v in agent_args.items() if v is not None}\n        # Update with any additional kwargs\n        agent_args.update(kwargs)\n        # Create agent instance\n        return cls(**agent_args)\n\n    @classmethod\n    def from_hub(\n        cls,\n        repo_id: str,\n        token: str | None = None,\n        trust_remote_code: bool = False,\n        **kwargs,\n    ):\n        \"\"\"\n        Loads an agent defined on the Hub.\n\n        <Tip warning={true}>\n\n        Loading a tool from the Hub means that you'll download the tool and execute it locally.\n        ALWAYS inspect the tool you're downloading before loading it within your runtime, as you would do when\n        installing a package using pip/npm/apt.\n\n        </Tip>\n\n        Args:\n            repo_id (`str`):\n                The name of the repo on the Hub where your tool is defined.\n            token (`str`, *optional*):\n                The token to identify you on hf.co. 
If unset, will use the token generated when running\n                `huggingface-cli login` (stored in `~/.huggingface`).\n            trust_remote_code (`bool`, *optional*, defaults to False):\n                This flag marks that you understand the risk of running remote code and that you trust this agent.\n                If this is not set to True, loading the agent from the Hub will fail.\n            kwargs (additional keyword arguments, *optional*):\n                Additional keyword arguments that will be split in two: all arguments relevant to the Hub (such as\n                `cache_dir`, `revision`, `subfolder`) will be used when downloading the files for your agent, and the\n                others will be passed along to its init.\n        \"\"\"\n        if not trust_remote_code:\n            raise ValueError(\n                \"Loading an agent from Hub requires to acknowledge you trust its code: to do so, pass `trust_remote_code=True`.\"\n            )\n\n        # Get the agent's Hub folder.\n        download_kwargs = {\"token\": token, \"repo_type\": \"space\"} | {\n            key: kwargs.pop(key)\n            for key in [\n                \"cache_dir\",\n                \"force_download\",\n                \"proxies\",\n                \"revision\",\n                \"local_files_only\",\n            ]\n            if key in kwargs\n        }\n\n        download_folder = Path(snapshot_download(repo_id=repo_id, **download_kwargs))\n        return cls.from_folder(download_folder, **kwargs)\n\n    @classmethod\n    def from_folder(cls, folder: str | Path, **kwargs):\n        \"\"\"Loads an agent from a local folder.\n\n        Args:\n            folder (`str` or `Path`): The folder where the agent is saved.\n            **kwargs: Additional keyword arguments that will be passed to the agent's init.\n        \"\"\"\n        # Load agent.json\n        folder = Path(folder)\n        agent_dict = json.loads((folder / \"agent.json\").read_text())\n        # 
Handle HfApiModel -> InferenceClientModel rename for old agents\n        if agent_dict.get(\"model\", {}).get(\"class\") == \"HfApiModel\":\n            agent_dict[\"model\"][\"class\"] = \"InferenceClientModel\"\n            logger.warning(\n                \"The agent you're loading uses the deprecated 'HfApiModel' class: it was automatically updated to 'InferenceClientModel'.\"\n            )\n        # Load managed agents from their respective folders, recursively\n        managed_agents = []\n        for managed_agent_name, managed_agent_class_name in agent_dict[\"managed_agents\"].items():\n            agent_cls = AGENT_REGISTRY.get(managed_agent_class_name)\n            if agent_cls is None:\n                raise ValueError(\n                    f\"Unknown agent class '{managed_agent_class_name}'. \"\n                    f\"Supported agents: {', '.join(sorted(AGENT_REGISTRY.keys()))}\"\n                )\n            managed_agents.append(agent_cls.from_folder(folder / \"managed_agents\" / managed_agent_name))\n        agent_dict[\"managed_agents\"] = {}\n\n        # Load tools\n        tools = []\n        for tool_name in agent_dict[\"tools\"]:\n            tool_code = (folder / \"tools\" / f\"{tool_name}.py\").read_text()\n            tools.append({\"name\": tool_name, \"code\": tool_code})\n        agent_dict[\"tools\"] = tools\n\n        # Add managed agents to kwargs to override the empty list in from_dict\n        if managed_agents:\n            kwargs[\"managed_agents\"] = managed_agents\n\n        return cls.from_dict(agent_dict, **kwargs)\n\n    def push_to_hub(\n        self,\n        repo_id: str,\n        commit_message: str = \"Upload agent\",\n        private: bool | None = None,\n        token: bool | str | None = None,\n        create_pr: bool = False,\n    ) -> str:\n        \"\"\"\n        Upload the agent to the Hub.\n\n        Parameters:\n            repo_id (`str`):\n                The name of the repository you want to push to. 
It should contain your organization name when\n                pushing to a given organization.\n            commit_message (`str`, *optional*, defaults to `\"Upload agent\"`):\n                Message to commit while pushing.\n            private (`bool`, *optional*, defaults to `None`):\n                Whether to make the repo private. If `None`, the repo will be public unless the organization's default is private. This value is ignored if the repo already exists.\n            token (`bool` or `str`, *optional*):\n                The token to use as HTTP bearer authorization for remote files. If unset, will use the token generated\n                when running `huggingface-cli login` (stored in `~/.huggingface`).\n            create_pr (`bool`, *optional*, defaults to `False`):\n                Whether to create a PR with the uploaded files or directly commit.\n        \"\"\"\n        repo_url = create_repo(\n            repo_id=repo_id,\n            token=token,\n            private=private,\n            exist_ok=True,\n            repo_type=\"space\",\n            space_sdk=\"gradio\",\n        )\n        repo_id = repo_url.repo_id\n        metadata_update(\n            repo_id,\n            {\"tags\": [\"smolagents\", \"agent\"]},\n            repo_type=\"space\",\n            token=token,\n            overwrite=True,\n        )\n\n        with tempfile.TemporaryDirectory() as work_dir:\n            self.save(work_dir)\n            logger.info(f\"Uploading the following files to {repo_id}: {','.join(os.listdir(work_dir))}\")\n            return upload_folder(\n                repo_id=repo_id,\n                commit_message=commit_message,\n                folder_path=work_dir,\n                token=token,\n                create_pr=create_pr,\n                repo_type=\"space\",\n            )\n\n\nclass ToolCallingAgent(MultiStepAgent):\n    \"\"\"\n    This agent uses JSON-like tool calls, using method `model.get_tool_call` to leverage the LLM engine's 
tool calling capabilities.\n\n    Args:\n        tools (`list[Tool]`): [`Tool`]s that the agent can use.\n        model (`Model`): Model that will generate the agent's actions.\n        prompt_templates ([`~agents.PromptTemplates`], *optional*): Prompt templates.\n        planning_interval (`int`, *optional*): Interval at which the agent will run a planning step.\n        stream_outputs (`bool`, *optional*, default `False`): Whether to stream outputs during execution.\n        max_tool_threads (`int`, *optional*): Maximum number of threads for parallel tool calls.\n            Higher values increase concurrency but resource usage as well.\n            Defaults to `ThreadPoolExecutor`'s default.\n        **kwargs: Additional keyword arguments.\n    \"\"\"\n\n    def __init__(\n        self,\n        tools: list[Tool],\n        model: Model,\n        prompt_templates: PromptTemplates | None = None,\n        planning_interval: int | None = None,\n        stream_outputs: bool = False,\n        max_tool_threads: int | None = None,\n        **kwargs,\n    ):\n        prompt_templates = prompt_templates or yaml.safe_load(\n            importlib.resources.files(\"smolagents.prompts\").joinpath(\"toolcalling_agent.yaml\").read_text()\n        )\n        super().__init__(\n            tools=tools,\n            model=model,\n            prompt_templates=prompt_templates,\n            planning_interval=planning_interval,\n            **kwargs,\n        )\n        # Streaming setup\n        self.stream_outputs = stream_outputs\n        if self.stream_outputs and not hasattr(self.model, \"generate_stream\"):\n            raise ValueError(\n                \"`stream_outputs` is set to True, but the model class implements no `generate_stream` method.\"\n            )\n        # Tool calling setup\n        self.max_tool_threads = max_tool_threads\n\n    @property\n    def tools_and_managed_agents(self):\n        \"\"\"Returns a combined list of tools and managed agents.\"\"\"\n     
   return list(self.tools.values()) + list(self.managed_agents.values())\n\n    def initialize_system_prompt(self) -> str:\n        system_prompt = populate_template(\n            self.prompt_templates[\"system_prompt\"],\n            variables={\n                \"tools\": self.tools,\n                \"managed_agents\": self.managed_agents,\n                \"custom_instructions\": self.instructions,\n            },\n        )\n        return system_prompt\n\n    def _step_stream(\n        self, memory_step: ActionStep\n    ) -> Generator[ChatMessageStreamDelta | ToolCall | ToolOutput | ActionOutput]:\n        \"\"\"\n        Perform one step in the ReAct framework: the agent thinks, acts, and observes the result.\n        Yields ChatMessageStreamDelta during the run if streaming is enabled.\n        At the end, yields either None if the step is not final, or the final answer.\n        \"\"\"\n        memory_messages = self.write_memory_to_messages()\n\n        input_messages = memory_messages.copy()\n\n        # Add new step in logs\n        memory_step.model_input_messages = input_messages\n\n        try:\n            if self.stream_outputs and hasattr(self.model, \"generate_stream\"):\n                output_stream = self.model.generate_stream(\n                    input_messages,\n                    stop_sequences=[\"Observation:\", \"Calling tools:\"],\n                    tools_to_call_from=self.tools_and_managed_agents,\n                )\n\n                chat_message_stream_deltas: list[ChatMessageStreamDelta] = []\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in output_stream:\n                        chat_message_stream_deltas.append(event)\n                        live.update(\n                            Markdown(agglomerate_stream_deltas(chat_message_stream_deltas).render_as_markdown())\n                        )\n                        yield event\n            
    chat_message = agglomerate_stream_deltas(chat_message_stream_deltas)\n            else:\n                chat_message: ChatMessage = self.model.generate(\n                    input_messages,\n                    stop_sequences=[\"Observation:\", \"Calling tools:\"],\n                    tools_to_call_from=self.tools_and_managed_agents,\n                )\n                self.logger.log_markdown(\n                    content=str(chat_message.content or chat_message.raw or \"\"),\n                    title=\"Output message of the LLM:\",\n                    level=LogLevel.DEBUG,\n                )\n\n            # Record model output\n            memory_step.model_output_message = chat_message\n            memory_step.model_output = chat_message.content\n            memory_step.token_usage = chat_message.token_usage\n        except Exception as e:\n            raise AgentGenerationError(f\"Error while generating output:\\n{e}\", self.logger) from e\n\n        if chat_message.tool_calls is None or len(chat_message.tool_calls) == 0:\n            try:\n                chat_message = self.model.parse_tool_calls(chat_message)\n            except Exception as e:\n                raise AgentParsingError(f\"Error while parsing tool call from model output: {e}\", self.logger)\n        else:\n            for tool_call in chat_message.tool_calls:\n                tool_call.function.arguments = parse_json_if_needed(tool_call.function.arguments)\n        final_answer, got_final_answer = None, False\n        for output in self.process_tool_calls(chat_message, memory_step):\n            yield output\n            if isinstance(output, ToolOutput):\n                if output.is_final_answer:\n                    if len(chat_message.tool_calls) > 1:\n                        raise AgentExecutionError(\n                            \"If you want to return an answer, please do not perform any other tool calls than the final answer tool call!\",\n                            
self.logger,\n                        )\n                    if got_final_answer:\n                        raise AgentToolExecutionError(\n                            \"You returned multiple final answers. Please return only one single final answer!\",\n                            self.logger,\n                        )\n                    final_answer = output.output\n                    got_final_answer = True\n\n                    # Manage state variables\n                    if isinstance(final_answer, str) and final_answer in self.state.keys():\n                        final_answer = self.state[final_answer]\n        yield ActionOutput(\n            output=final_answer,\n            is_final_answer=got_final_answer,\n        )\n\n    def process_tool_calls(\n        self, chat_message: ChatMessage, memory_step: ActionStep\n    ) -> Generator[ToolCall | ToolOutput]:\n        \"\"\"Process tool calls from the model output and update agent memory.\n\n        Args:\n            chat_message (`ChatMessage`): Chat message containing tool calls from the model.\n            memory_step (`ActionStep)`: Memory ActionStep to update with results.\n\n        Yields:\n            `ToolCall | ToolOutput`: The tool call or tool output.\n        \"\"\"\n        parallel_calls: dict[str, ToolCall] = {}\n        assert chat_message.tool_calls is not None\n        for chat_tool_call in chat_message.tool_calls:\n            tool_call = ToolCall(\n                name=chat_tool_call.function.name, arguments=chat_tool_call.function.arguments, id=chat_tool_call.id\n            )\n            yield tool_call\n            parallel_calls[tool_call.id] = tool_call\n\n        # Helper function to process a single tool call\n        def process_single_tool_call(tool_call: ToolCall) -> ToolOutput:\n            tool_name = tool_call.name\n            tool_arguments = tool_call.arguments or {}\n            self.logger.log(\n                Panel(Text(f\"Calling tool: '{tool_name}' with 
arguments: {tool_arguments}\")),\n                level=LogLevel.INFO,\n            )\n            tool_call_result = self.execute_tool_call(tool_name, tool_arguments)\n            tool_call_result_type = type(tool_call_result)\n            if tool_call_result_type in [AgentImage, AgentAudio]:\n                if tool_call_result_type == AgentImage:\n                    observation_name = \"image.png\"\n                elif tool_call_result_type == AgentAudio:\n                    observation_name = \"audio.mp3\"\n                # TODO: tool_call_result naming could allow for different names of same type\n                self.state[observation_name] = tool_call_result\n                observation = f\"Stored '{observation_name}' in memory.\"\n            else:\n                observation = str(tool_call_result).strip()\n            self.logger.log(\n                f\"Observations: {observation.replace('[', '|')}\",  # escape potential rich-tag-like components\n                level=LogLevel.INFO,\n            )\n            is_final_answer = tool_name == \"final_answer\"\n\n            return ToolOutput(\n                id=tool_call.id,\n                output=tool_call_result,\n                is_final_answer=is_final_answer,\n                observation=observation,\n                tool_call=tool_call,\n            )\n\n        # Process tool calls in parallel\n        outputs = {}\n        if len(parallel_calls) == 1:\n            # If there's only one call, process it directly\n            tool_call = list(parallel_calls.values())[0]\n            tool_output = process_single_tool_call(tool_call)\n            outputs[tool_output.id] = tool_output\n            yield tool_output\n        else:\n            # If multiple tool calls, process them in parallel\n            with ThreadPoolExecutor(self.max_tool_threads) as executor:\n                futures = []\n                for tool_call in parallel_calls.values():\n                    ctx = copy_context()\n  
                  futures.append(executor.submit(ctx.run, process_single_tool_call, tool_call))\n                for future in as_completed(futures):\n                    tool_output = future.result()\n                    outputs[tool_output.id] = tool_output\n                    yield tool_output\n\n        memory_step.tool_calls = [parallel_calls[k] for k in sorted(parallel_calls.keys())]\n        memory_step.observations = memory_step.observations or \"\"\n        for tool_output in [outputs[k] for k in sorted(outputs.keys())]:\n            memory_step.observations += tool_output.observation + \"\\n\"\n        memory_step.observations = (\n            memory_step.observations.rstrip(\"\\n\") if memory_step.observations else memory_step.observations\n        )\n\n    def _substitute_state_variables(self, arguments: dict[str, str] | str) -> dict[str, Any] | str:\n        \"\"\"Replace string values in arguments with their corresponding state values if they exist.\"\"\"\n        if isinstance(arguments, dict):\n            return {\n                key: self.state.get(value, value) if isinstance(value, str) else value\n                for key, value in arguments.items()\n            }\n        return arguments\n\n    def execute_tool_call(self, tool_name: str, arguments: dict[str, str] | str) -> Any:\n        \"\"\"\n        Execute a tool or managed agent with the provided arguments.\n\n        The arguments are replaced with the actual values from the state if they refer to state variables.\n\n        Args:\n            tool_name (`str`): Name of the tool or managed agent to execute.\n            arguments (dict[str, str] | str): Arguments passed to the tool call.\n        \"\"\"\n        # Check if the tool exists\n        available_tools = {**self.tools, **self.managed_agents}\n        if tool_name not in available_tools:\n            raise AgentToolExecutionError(\n                f\"Unknown tool {tool_name}, should be one of: {', '.join(available_tools)}.\", 
self.logger\n            )\n\n        # Get the tool and substitute state variables in arguments\n        tool = available_tools[tool_name]\n        arguments = self._substitute_state_variables(arguments)\n        is_managed_agent = tool_name in self.managed_agents\n\n        try:\n            validate_tool_arguments(tool, arguments)\n        except (ValueError, TypeError) as e:\n            raise AgentToolCallError(str(e), self.logger) from e\n        except Exception as e:\n            error_msg = f\"Error executing tool '{tool_name}' with arguments {str(arguments)}: {type(e).__name__}: {e}\"\n            raise AgentToolExecutionError(error_msg, self.logger) from e\n\n        try:\n            # Call tool with appropriate arguments\n            if isinstance(arguments, dict):\n                return tool(**arguments) if is_managed_agent else tool(**arguments, sanitize_inputs_outputs=True)\n            else:\n                return tool(arguments) if is_managed_agent else tool(arguments, sanitize_inputs_outputs=True)\n\n        except Exception as e:\n            # Handle execution errors\n            if is_managed_agent:\n                error_msg = (\n                    f\"Error executing request to team member '{tool_name}' with arguments {str(arguments)}: {e}\\n\"\n                    \"Please try again or request to another team member\"\n                )\n            else:\n                error_msg = (\n                    f\"Error executing tool '{tool_name}' with arguments {str(arguments)}: {type(e).__name__}: {e}\\n\"\n                    \"Please try again or use another tool\"\n                )\n            raise AgentToolExecutionError(error_msg, self.logger) from e\n\n\nclass CodeAgent(MultiStepAgent):\n    \"\"\"\n    In this agent, the tool calls will be formulated by the LLM in code format, then parsed and executed.\n\n    Args:\n        tools (`list[Tool]`): [`Tool`]s that the agent can use.\n        model (`Model`): Model that will generate 
the agent's actions.\n        prompt_templates ([`~agents.PromptTemplates`], *optional*): Prompt templates.\n        additional_authorized_imports (`list[str]`, *optional*): Additional authorized imports for the agent.\n        planning_interval (`int`, *optional*): Interval at which the agent will run a planning step.\n        executor ([`PythonExecutor`], *optional*): Custom Python code executor. If not provided, a default executor will be created based on `executor_type`.\n        executor_type (`Literal[\"local\", \"blaxel\", \"e2b\", \"modal\", \"docker\", \"wasm\"]`, default `\"local\"`): Type of code executor.\n        executor_kwargs (`dict`, *optional*): Additional arguments to pass to initialize the executor.\n        max_print_outputs_length (`int`, *optional*): Maximum length of the print outputs.\n        stream_outputs (`bool`, *optional*, default `False`): Whether to stream outputs during execution.\n        use_structured_outputs_internally (`bool`, default `False`): Whether to use structured generation at each action step: improves performance for many models.\n\n            <Added version=\"1.17.0\"/>\n        code_block_tags (`tuple[str, str]` | `Literal[\"markdown\"]`, *optional*): Opening and closing tags for code blocks (regex strings). 
Pass a custom tuple, or pass 'markdown' to use (\"```(?:python|py)\", \"\\\\n```\"), leave empty to use (\"<code>\", \"</code>\").\n        **kwargs: Additional keyword arguments.\n    \"\"\"\n\n    def __init__(\n        self,\n        tools: list[Tool],\n        model: Model,\n        prompt_templates: PromptTemplates | None = None,\n        additional_authorized_imports: list[str] | None = None,\n        planning_interval: int | None = None,\n        executor: PythonExecutor = None,\n        executor_type: Literal[\"local\", \"blaxel\", \"e2b\", \"modal\", \"docker\", \"wasm\"] = \"local\",\n        executor_kwargs: dict[str, Any] | None = None,\n        max_print_outputs_length: int | None = None,\n        stream_outputs: bool = False,\n        use_structured_outputs_internally: bool = False,\n        code_block_tags: str | tuple[str, str] | None = None,\n        **kwargs,\n    ):\n        self.additional_authorized_imports = additional_authorized_imports if additional_authorized_imports else []\n        self.authorized_imports = sorted(set(BASE_BUILTIN_MODULES) | set(self.additional_authorized_imports))\n        self.max_print_outputs_length = max_print_outputs_length\n        self._use_structured_outputs_internally = use_structured_outputs_internally\n        if self._use_structured_outputs_internally:\n            prompt_templates = prompt_templates or yaml.safe_load(\n                importlib.resources.files(\"smolagents.prompts\").joinpath(\"structured_code_agent.yaml\").read_text()\n            )\n        else:\n            prompt_templates = prompt_templates or yaml.safe_load(\n                importlib.resources.files(\"smolagents.prompts\").joinpath(\"code_agent.yaml\").read_text()\n            )\n\n        if isinstance(code_block_tags, str) and not code_block_tags == \"markdown\":\n            raise ValueError(\"Only 'markdown' is supported for a string argument to `code_block_tags`.\")\n        self.code_block_tags = (\n            
code_block_tags\n            if isinstance(code_block_tags, tuple)\n            else (\"```python\", \"```\")\n            if code_block_tags == \"markdown\"\n            else (\"<code>\", \"</code>\")\n        )\n\n        super().__init__(\n            tools=tools,\n            model=model,\n            prompt_templates=prompt_templates,\n            planning_interval=planning_interval,\n            **kwargs,\n        )\n        self.stream_outputs = stream_outputs\n        if self.stream_outputs and not hasattr(self.model, \"generate_stream\"):\n            raise ValueError(\n                \"`stream_outputs` is set to True, but the model class implements no `generate_stream` method.\"\n            )\n        if \"*\" in self.additional_authorized_imports:\n            self.logger.log(\n                \"Caution: you set an authorization for all imports, meaning your agent can decide to import any package it deems necessary. This might raise issues if the package is not installed in your environment.\",\n                level=LogLevel.INFO,\n            )\n        self.executor_type = executor_type\n        self.executor_kwargs: dict[str, Any] = executor_kwargs or {}\n        self.python_executor = executor or self.create_python_executor()\n\n    def __enter__(self):\n        return self\n\n    def __exit__(self, exc_type, exc_value, traceback):\n        self.cleanup()\n\n    def cleanup(self):\n        \"\"\"Clean up resources used by the agent, such as the remote Python executor.\"\"\"\n        if hasattr(self.python_executor, \"cleanup\"):\n            self.python_executor.cleanup()\n\n    def create_python_executor(self) -> PythonExecutor:\n        if self.executor_type not in {\"local\", \"blaxel\", \"e2b\", \"modal\", \"docker\", \"wasm\"}:\n            raise ValueError(f\"Unsupported executor type: {self.executor_type}\")\n\n        if self.executor_type == \"local\":\n            return LocalPythonExecutor(\n                
self.additional_authorized_imports,\n                **{\"max_print_outputs_length\": self.max_print_outputs_length} | self.executor_kwargs,\n            )\n        else:\n            if self.managed_agents:\n                raise Exception(\"Managed agents are not yet supported with remote code execution.\")\n            remote_executors = {\n                \"blaxel\": BlaxelExecutor,\n                \"e2b\": E2BExecutor,\n                \"docker\": DockerExecutor,\n                \"wasm\": WasmExecutor,\n                \"modal\": ModalExecutor,\n            }\n            return remote_executors[self.executor_type](\n                self.additional_authorized_imports, self.logger, **self.executor_kwargs\n            )\n\n    def initialize_system_prompt(self) -> str:\n        system_prompt = populate_template(\n            self.prompt_templates[\"system_prompt\"],\n            variables={\n                \"tools\": self.tools,\n                \"managed_agents\": self.managed_agents,\n                \"authorized_imports\": (\n                    \"You can import from any package you want.\"\n                    if \"*\" in self.authorized_imports\n                    else str(self.authorized_imports)\n                ),\n                \"custom_instructions\": self.instructions,\n                \"code_block_opening_tag\": self.code_block_tags[0],\n                \"code_block_closing_tag\": self.code_block_tags[1],\n            },\n        )\n        return system_prompt\n\n    def _step_stream(\n        self, memory_step: ActionStep\n    ) -> Generator[ChatMessageStreamDelta | ToolCall | ToolOutput | ActionOutput]:\n        \"\"\"\n        Perform one step in the ReAct framework: the agent thinks, acts, and observes the result.\n        Yields ChatMessageStreamDelta during the run if streaming is enabled.\n        At the end, yields either None if the step is not final, or the final answer.\n        \"\"\"\n        memory_messages = 
self.write_memory_to_messages()\n\n        input_messages = memory_messages.copy()\n        ### Generate model output ###\n        memory_step.model_input_messages = input_messages\n        stop_sequences = [\"Observation:\", \"Calling tools:\"]\n        if self.code_block_tags[1] not in self.code_block_tags[0]:\n            # If the closing tag is contained in the opening tag, adding it as a stop sequence would cut short any code generation\n            stop_sequences.append(self.code_block_tags[1])\n        try:\n            additional_args: dict[str, Any] = {}\n            if self._use_structured_outputs_internally:\n                additional_args[\"response_format\"] = CODEAGENT_RESPONSE_FORMAT\n            if self.stream_outputs:\n                output_stream = self.model.generate_stream(\n                    input_messages,\n                    stop_sequences=stop_sequences,\n                    **additional_args,\n                )\n                chat_message_stream_deltas: list[ChatMessageStreamDelta] = []\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in output_stream:\n                        chat_message_stream_deltas.append(event)\n                        live.update(\n                            Markdown(agglomerate_stream_deltas(chat_message_stream_deltas).render_as_markdown())\n                        )\n                        yield event\n                chat_message = agglomerate_stream_deltas(chat_message_stream_deltas)\n                memory_step.model_output_message = chat_message\n                output_text = chat_message.content\n            else:\n                chat_message: ChatMessage = self.model.generate(\n                    input_messages,\n                    stop_sequences=stop_sequences,\n                    **additional_args,\n                )\n                memory_step.model_output_message = chat_message\n                output_text = 
chat_message.content\n                self.logger.log_markdown(\n                    content=output_text or \"\",\n                    title=\"Output message of the LLM:\",\n                    level=LogLevel.DEBUG,\n                )\n\n            if not self._use_structured_outputs_internally:\n                # This adds the end code sequence (i.e. the closing code block tag) to the history.\n                # This will nudge subsequent LLM calls to finish with this end code sequence, thus efficiently stopping generation.\n                if output_text and not output_text.strip().endswith(self.code_block_tags[1]):\n                    output_text += self.code_block_tags[1]\n                    memory_step.model_output_message.content = output_text\n\n            memory_step.token_usage = chat_message.token_usage\n            memory_step.model_output = output_text\n        except Exception as e:\n            raise AgentGenerationError(f\"Error in generating model output:\\n{e}\", self.logger) from e\n\n        ### Parse output ###\n        try:\n            if self._use_structured_outputs_internally:\n                code_action = json.loads(output_text)[\"code\"]\n                code_action = extract_code_from_text(code_action, self.code_block_tags) or code_action\n            else:\n                code_action = parse_code_blobs(output_text, self.code_block_tags)\n            code_action = fix_final_answer_code(code_action)\n            memory_step.code_action = code_action\n        except Exception as e:\n            error_msg = f\"Error in code parsing:\\n{e}\\nMake sure to provide correct code blobs.\"\n            raise AgentParsingError(error_msg, self.logger)\n\n        tool_call = ToolCall(\n            name=\"python_interpreter\",\n            arguments=code_action,\n            id=f\"call_{len(self.memory.steps)}\",\n        )\n        yield tool_call\n        memory_step.tool_calls = [tool_call]\n\n        ### Execute action ###\n        
self.logger.log_code(title=\"Executing parsed code:\", content=code_action, level=LogLevel.INFO)\n        try:\n            code_output = self.python_executor(code_action)\n            execution_outputs_console = []\n            if len(code_output.logs) > 0:\n                execution_outputs_console += [\n                    Text(\"Execution logs:\", style=\"bold\"),\n                    Text(code_output.logs),\n                ]\n            observation = \"Execution logs:\\n\" + code_output.logs\n        except Exception as e:\n            if hasattr(self.python_executor, \"state\") and \"_print_outputs\" in self.python_executor.state:\n                execution_logs = str(self.python_executor.state[\"_print_outputs\"])\n                if len(execution_logs) > 0:\n                    execution_outputs_console = [\n                        Text(\"Execution logs:\", style=\"bold\"),\n                        Text(execution_logs),\n                    ]\n                    memory_step.observations = \"Execution logs:\\n\" + execution_logs\n                    self.logger.log(Group(*execution_outputs_console), level=LogLevel.INFO)\n            error_msg = str(e)\n            if \"Import of \" in error_msg and \" is not allowed\" in error_msg:\n                self.logger.log(\n                    \"[bold red]Warning to user: Code execution failed due to an unauthorized import - Consider passing said import under `additional_authorized_imports` when initializing your CodeAgent.\",\n                    level=LogLevel.INFO,\n                )\n            raise AgentExecutionError(error_msg, self.logger)\n\n        truncated_output = truncate_content(str(code_output.output))\n        observation += \"Last output from code snippet:\\n\" + truncated_output\n        memory_step.observations = observation\n\n        if not code_output.is_final_answer:\n            execution_outputs_console += [\n                Text(\n                    f\"Out: {truncated_output}\",\n     
           ),\n            ]\n        self.logger.log(Group(*execution_outputs_console), level=LogLevel.INFO)\n        memory_step.action_output = code_output.output\n        yield ActionOutput(output=code_output.output, is_final_answer=code_output.is_final_answer)\n\n    def to_dict(self) -> dict[str, Any]:\n        \"\"\"Convert the agent to a dictionary representation.\n\n        Returns:\n            `dict`: Dictionary representation of the agent.\n        \"\"\"\n        agent_dict = super().to_dict()\n        agent_dict[\"authorized_imports\"] = self.authorized_imports\n        agent_dict[\"executor_type\"] = self.executor_type\n        agent_dict[\"executor_kwargs\"] = self.executor_kwargs\n        agent_dict[\"max_print_outputs_length\"] = self.max_print_outputs_length\n        return agent_dict\n\n    @classmethod\n    def from_dict(cls, agent_dict: dict[str, Any], **kwargs) -> \"CodeAgent\":\n        \"\"\"Create CodeAgent from a dictionary representation.\n\n        Args:\n            agent_dict (`dict[str, Any]`): Dictionary representation of the agent.\n            **kwargs: Additional keyword arguments that will override agent_dict values.\n\n        Returns:\n            `CodeAgent`: Instance of the CodeAgent class.\n        \"\"\"\n        # Add CodeAgent-specific parameters to kwargs\n        code_agent_kwargs = {\n            \"additional_authorized_imports\": agent_dict.get(\"authorized_imports\"),\n            \"executor_type\": agent_dict.get(\"executor_type\"),\n            \"executor_kwargs\": agent_dict.get(\"executor_kwargs\"),\n            \"max_print_outputs_length\": agent_dict.get(\"max_print_outputs_length\"),\n            \"code_block_tags\": agent_dict.get(\"code_block_tags\"),\n        }\n        # Filter out None values\n        code_agent_kwargs = {k: v for k, v in code_agent_kwargs.items() if v is not None}\n        # Update with any additional kwargs\n        code_agent_kwargs.update(kwargs)\n        # Call the parent class's 
from_dict method\n        return super().from_dict(agent_dict, **code_agent_kwargs)\n\n\n# Agent Registry for secure deserialization\n# This registry maps agent class names to their actual classes.\n# Only classes listed here can be instantiated during deserialization (from_dict/from_folder).\n# This prevents arbitrary code execution via importlib-based dynamic loading.\nAGENT_REGISTRY = {\n    \"ToolCallingAgent\": ToolCallingAgent,\n    \"CodeAgent\": CodeAgent,\n}\n"
  },
  {
    "path": "src/smolagents/cli.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2025 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport argparse\nimport os\n\nfrom dotenv import load_dotenv\nfrom rich.console import Console\nfrom rich.panel import Panel\nfrom rich.prompt import Confirm, Prompt\nfrom rich.rule import Rule\nfrom rich.table import Table\n\nfrom smolagents import (\n    CodeAgent,\n    InferenceClientModel,\n    LiteLLMModel,\n    Model,\n    OpenAIModel,\n    Tool,\n    ToolCallingAgent,\n    TransformersModel,\n)\nfrom smolagents.default_tools import TOOL_MAPPING\n\n\nconsole = Console()\n\nleopard_prompt = \"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\"\n\n\ndef parse_arguments():\n    parser = argparse.ArgumentParser(description=\"Run a CodeAgent with all specified parameters\")\n    parser.add_argument(\n        \"prompt\",\n        type=str,\n        nargs=\"?\",\n        default=None,\n        help=\"The prompt to run with the agent. 
If no prompt is provided, interactive mode will be launched to guide user through agent setup\",\n    )\n    parser.add_argument(\n        \"--model-type\",\n        type=str,\n        default=\"InferenceClientModel\",\n        help=\"The model type to use (e.g., InferenceClientModel, OpenAIModel, LiteLLMModel, TransformersModel)\",\n    )\n    parser.add_argument(\n        \"--action-type\",\n        type=str,\n        default=\"code\",\n        help=\"The action type to use (e.g., code, tool_calling)\",\n    )\n    parser.add_argument(\n        \"--model-id\",\n        type=str,\n        default=\"Qwen/Qwen3-Next-80B-A3B-Thinking\",\n        help=\"The model ID to use for the specified model type\",\n    )\n    parser.add_argument(\n        \"--imports\",\n        nargs=\"*\",  # accepts zero or more arguments\n        default=[],\n        help=\"Space-separated list of imports to authorize (e.g., 'numpy pandas')\",\n    )\n    parser.add_argument(\n        \"--tools\",\n        nargs=\"*\",\n        default=[\"web_search\"],\n        help=\"Space-separated list of tools that the agent can use (e.g., 'tool1 tool2 tool3')\",\n    )\n    parser.add_argument(\n        \"--verbosity-level\",\n        type=int,\n        default=1,\n        help=\"The verbosity level, as an int in [0, 1, 2].\",\n    )\n    group = parser.add_argument_group(\"api options\", \"Options for API-based model types\")\n    group.add_argument(\n        \"--provider\",\n        type=str,\n        default=None,\n        help=\"The inference provider to use for the model\",\n    )\n    group.add_argument(\n        \"--api-base\",\n        type=str,\n        help=\"The base URL for the model\",\n    )\n    group.add_argument(\n        \"--api-key\",\n        type=str,\n        help=\"The API key for the model\",\n    )\n    return parser.parse_args()\n\n\ndef interactive_mode():\n    \"\"\"Run the CLI in interactive mode\"\"\"\n    console.print(\n        Panel.fit(\n            \"[bold magenta]🤖 
SmolaGents CLI[/]\\n[dim]Intelligent agents at your service[/]\", border_style=\"magenta\"\n        )\n    )\n\n    console.print(\"\\n[bold yellow]Welcome to smolagents![/] Let's set up your agent step by step.\\n\")\n\n    # Get user input step by step\n    console.print(Rule(\"[bold yellow]⚙️  Configuration\", style=\"bold yellow\"))\n\n    # Get agent action type\n    action_type = Prompt.ask(\n        \"[bold white]What action type would you like to use? 'code' or 'tool_calling'?[/]\",\n        default=\"code\",\n        choices=[\"code\", \"tool_calling\"],\n    )\n\n    # Show available tools\n    tools_table = Table(title=\"[bold yellow]🛠️  Available Tools\", show_header=True, header_style=\"bold yellow\")\n    tools_table.add_column(\"Tool Name\", style=\"bold yellow\")\n    tools_table.add_column(\"Description\", style=\"white\")\n\n    for tool_name, tool_class in TOOL_MAPPING.items():\n        # Get description from the tool class if available\n        try:\n            tool_instance = tool_class()\n            description = getattr(tool_instance, \"description\", \"No description available\")\n        except Exception:\n            description = \"Built-in tool\"\n        tools_table.add_row(tool_name, description)\n\n    console.print(tools_table)\n    console.print(\n        \"\\n[dim]You can also use HuggingFace Spaces by providing the full path (e.g., 'username/spacename')[/]\"\n    )\n\n    console.print(\"[dim]Enter tool names separated by spaces (e.g., 'web_search python_interpreter')[/]\")\n    tools_input = Prompt.ask(\"[bold white]Select tools for your agent[/]\", default=\"web_search\")\n    tools = tools_input.split()\n\n    # Get model configuration\n    console.print(\"\\n[bold yellow]Model Configuration:[/]\")\n    model_type = Prompt.ask(\n        \"[bold]Model type[/]\",\n        default=\"InferenceClientModel\",\n        choices=[\"InferenceClientModel\", \"OpenAIServerModel\", \"LiteLLMModel\", \"TransformersModel\"],\n    )\n\n    
model_id = Prompt.ask(\"[bold white]Model ID[/]\", default=\"Qwen/Qwen2.5-Coder-32B-Instruct\")\n\n    # Optional configurations (action_type was already chosen above and must not be reset here)\n    provider = None\n    api_base = None\n    api_key = None\n    imports = []\n\n    if Confirm.ask(\"\\n[bold white]Configure advanced options?[/]\", default=False):\n        if model_type in [\"InferenceClientModel\", \"OpenAIModel\", \"OpenAIServerModel\", \"LiteLLMModel\"]:\n            provider = Prompt.ask(\"[bold white]Provider[/]\", default=\"\")\n            api_base = Prompt.ask(\"[bold white]API Base URL[/]\", default=\"\")\n            api_key = Prompt.ask(\"[bold white]API Key[/]\", default=\"\", password=True)\n\n        imports_input = Prompt.ask(\"[bold white]Additional imports (space-separated)[/]\", default=\"\")\n        if imports_input:\n            imports = imports_input.split()\n\n    # Get prompt\n    prompt = Prompt.ask(\n        \"[bold white]Now the final step; what task would you like the agent to perform?[/]\", default=leopard_prompt\n    )\n\n    return prompt, tools, model_type, model_id, provider, api_base, api_key, imports, action_type\n\n\ndef load_model(\n    model_type: str,\n    model_id: str,\n    api_base: str | None = None,\n    api_key: str | None = None,\n    provider: str | None = None,\n) -> Model:\n    # \"OpenAIServerModel\" is the name offered by interactive mode; treat it the same as \"OpenAIModel\"\n    if model_type in (\"OpenAIModel\", \"OpenAIServerModel\"):\n        return OpenAIModel(\n            api_key=api_key or os.getenv(\"FIREWORKS_API_KEY\"),\n            api_base=api_base or \"https://api.fireworks.ai/inference/v1\",\n            model_id=model_id,\n        )\n    elif model_type == \"LiteLLMModel\":\n        return LiteLLMModel(\n            model_id=model_id,\n            api_key=api_key,\n            api_base=api_base,\n        )\n    elif model_type == \"TransformersModel\":\n        return TransformersModel(model_id=model_id, device_map=\"auto\")\n    elif model_type == \"InferenceClientModel\":\n        return InferenceClientModel(\n            model_id=model_id,\n            
token=api_key or os.getenv(\"HF_API_KEY\"),\n            provider=provider,\n        )\n    else:\n        raise ValueError(f\"Unsupported model type: {model_type}\")\n\n\ndef run_smolagent(\n    prompt: str,\n    tools: list[str],\n    model_type: str,\n    model_id: str,\n    api_base: str | None = None,\n    api_key: str | None = None,\n    imports: list[str] | None = None,\n    provider: str | None = None,\n    action_type: str = \"code\",\n) -> None:\n    load_dotenv()\n\n    model = load_model(model_type, model_id, api_base=api_base, api_key=api_key, provider=provider)\n\n    available_tools = []\n\n    for tool_name in tools:\n        if \"/\" in tool_name:\n            space_name = tool_name.split(\"/\")[-1].lower().replace(\"-\", \"_\").replace(\".\", \"_\")\n            description = f\"Tool loaded from Hugging Face Space: {tool_name}\"\n            available_tools.append(Tool.from_space(space_id=tool_name, name=space_name, description=description))\n        else:\n            if tool_name in TOOL_MAPPING:\n                available_tools.append(TOOL_MAPPING[tool_name]())\n            else:\n                raise ValueError(f\"Tool {tool_name} is not recognized either as a default tool or a Space.\")\n\n    if action_type == \"code\":\n        agent = CodeAgent(\n            tools=available_tools,\n            model=model,\n            additional_authorized_imports=imports,\n            stream_outputs=True,\n        )\n    elif action_type == \"tool_calling\":\n        agent = ToolCallingAgent(tools=available_tools, model=model, stream_outputs=True)\n    else:\n        raise ValueError(f\"Unsupported action type: {action_type}\")\n\n    agent.run(prompt)\n\n\ndef main() -> None:\n    args = parse_arguments()\n\n    # Check if we should run in interactive mode\n    # Interactive mode is triggered when no prompt is provided\n    if args.prompt is None:\n        prompt, tools, model_type, model_id, provider, api_base, api_key, imports, action_type = 
interactive_mode()\n    else:\n        prompt = args.prompt\n        tools = args.tools\n        model_type = args.model_type\n        model_id = args.model_id\n        provider = args.provider\n        api_base = args.api_base\n        api_key = args.api_key\n        imports = args.imports\n        action_type = args.action_type\n\n    run_smolagent(\n        prompt,\n        tools,\n        model_type,\n        model_id,\n        provider=provider,\n        api_base=api_base,\n        api_key=api_key,\n        imports=imports,\n        action_type=action_type,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "src/smolagents/default_tools.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nfrom dataclasses import dataclass\nfrom typing import Any\n\nfrom .local_python_executor import (\n    BASE_BUILTIN_MODULES,\n    BASE_PYTHON_TOOLS,\n    MAX_EXECUTION_TIME_SECONDS,\n    evaluate_python_code,\n)\nfrom .tools import PipelineTool, Tool\n\n\n@dataclass\nclass PreTool:\n    name: str\n    inputs: dict[str, str]\n    output_type: type\n    task: str\n    description: str\n    repo_id: str\n\n\nclass PythonInterpreterTool(Tool):\n    name = \"python_interpreter\"\n    description = \"This is a tool that evaluates python code. It can be used to perform calculations.\"\n    inputs = {\n        \"code\": {\n            \"type\": \"string\",\n            \"description\": \"The python code to run in interpreter\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, *args, authorized_imports=None, timeout_seconds=MAX_EXECUTION_TIME_SECONDS, **kwargs):\n        if authorized_imports is None:\n            self.authorized_imports = list(set(BASE_BUILTIN_MODULES))\n        else:\n            self.authorized_imports = list(set(BASE_BUILTIN_MODULES) | set(authorized_imports))\n        self.inputs = {\n            \"code\": {\n                \"type\": \"string\",\n                \"description\": (\n                    \"The code snippet to evaluate. 
All variables used in this snippet must be defined in this same snippet, \"\n                    f\"else you will get an error. This code can only import the following python libraries: {self.authorized_imports}.\"\n                ),\n            }\n        }\n        self.base_python_tools = BASE_PYTHON_TOOLS\n        self.python_evaluator = evaluate_python_code\n        self.timeout_seconds = timeout_seconds\n        super().__init__(*args, **kwargs)\n\n    def forward(self, code: str) -> str:\n        state = {}\n        output = str(\n            self.python_evaluator(\n                code,\n                state=state,\n                static_tools=self.base_python_tools,\n                authorized_imports=self.authorized_imports,\n                timeout_seconds=self.timeout_seconds,\n            )[0]  # The second element is boolean is_final_answer\n        )\n        return f\"Stdout:\\n{str(state['_print_outputs'])}\\nOutput: {output}\"\n\n\nclass FinalAnswerTool(Tool):\n    name = \"final_answer\"\n    description = \"Provides a final answer to the given problem.\"\n    inputs = {\"answer\": {\"type\": \"any\", \"description\": \"The final answer to the problem\"}}\n    output_type = \"any\"\n\n    def forward(self, answer: Any) -> Any:\n        return answer\n\n\nclass UserInputTool(Tool):\n    name = \"user_input\"\n    description = \"Asks for user's input on a specific question\"\n    inputs = {\"question\": {\"type\": \"string\", \"description\": \"The question to ask the user\"}}\n    output_type = \"string\"\n\n    def forward(self, question):\n        user_input = input(f\"{question} => Type your answer here:\")\n        return user_input\n\n\nclass DuckDuckGoSearchTool(Tool):\n    \"\"\"Web search tool that performs searches using the DuckDuckGo search engine.\n\n    Args:\n        max_results (`int`, default `10`): Maximum number of search results to return.\n        rate_limit (`float`, default `1.0`): Maximum queries per second. 
Set to `None` to disable rate limiting.\n        **kwargs: Additional keyword arguments for the `DDGS` client.\n\n    Examples:\n        ```python\n        >>> from smolagents import DuckDuckGoSearchTool\n        >>> web_search_tool = DuckDuckGoSearchTool(max_results=5, rate_limit=2.0)\n        >>> results = web_search_tool(\"Hugging Face\")\n        >>> print(results)\n        ```\n    \"\"\"\n\n    name = \"web_search\"\n    description = \"\"\"Performs a duckduckgo web search based on your query (think a Google search) then returns the top search results.\"\"\"\n    inputs = {\"query\": {\"type\": \"string\", \"description\": \"The search query to perform.\"}}\n    output_type = \"string\"\n\n    def __init__(self, max_results: int = 10, rate_limit: float | None = 1.0, **kwargs):\n        super().__init__()\n        self.max_results = max_results\n        self.rate_limit = rate_limit\n        self._min_interval = 1.0 / rate_limit if rate_limit else 0.0\n        self._last_request_time = 0.0\n        try:\n            from ddgs import DDGS\n        except ImportError as e:\n            raise ImportError(\n                \"You must install package `ddgs` to run this tool: for instance run `pip install ddgs`.\"\n            ) from e\n        self.ddgs = DDGS(**kwargs)\n\n    def forward(self, query: str) -> str:\n        self._enforce_rate_limit()\n        results = self.ddgs.text(query, max_results=self.max_results)\n        if len(results) == 0:\n            raise Exception(\"No results found! 
Try a less restrictive/shorter query.\")\n        postprocessed_results = [f\"[{result['title']}]({result['href']})\\n{result['body']}\" for result in results]\n        return \"## Search Results\\n\\n\" + \"\\n\\n\".join(postprocessed_results)\n\n    def _enforce_rate_limit(self) -> None:\n        import time\n\n        # No rate limit enforced\n        if not self.rate_limit:\n            return\n\n        now = time.time()\n        elapsed = now - self._last_request_time\n        if elapsed < self._min_interval:\n            time.sleep(self._min_interval - elapsed)\n        self._last_request_time = time.time()\n\n\nclass GoogleSearchTool(Tool):\n    name = \"web_search\"\n    description = \"\"\"Performs a google web search for your query then returns a string of the top search results.\"\"\"\n    inputs = {\n        \"query\": {\"type\": \"string\", \"description\": \"The search query to perform.\"},\n        \"filter_year\": {\n            \"type\": \"integer\",\n            \"description\": \"Optionally restrict results to a certain year\",\n            \"nullable\": True,\n        },\n    }\n    output_type = \"string\"\n\n    def __init__(self, provider: str = \"serpapi\"):\n        super().__init__()\n        import os\n\n        self.provider = provider\n        if provider == \"serpapi\":\n            self.organic_key = \"organic_results\"\n            api_key_env_name = \"SERPAPI_API_KEY\"\n        else:\n            self.organic_key = \"organic\"\n            api_key_env_name = \"SERPER_API_KEY\"\n        self.api_key = os.getenv(api_key_env_name)\n        if self.api_key is None:\n            raise ValueError(f\"Missing API key. 
Make sure you have '{api_key_env_name}' in your env variables.\")\n\n    def forward(self, query: str, filter_year: int | None = None) -> str:\n        import requests\n\n        if self.provider == \"serpapi\":\n            params = {\n                \"q\": query,\n                \"api_key\": self.api_key,\n                \"engine\": \"google\",\n                \"google_domain\": \"google.com\",\n            }\n            base_url = \"https://serpapi.com/search.json\"\n        else:\n            params = {\n                \"q\": query,\n                \"api_key\": self.api_key,\n            }\n            base_url = \"https://google.serper.dev/search\"\n        if filter_year is not None:\n            params[\"tbs\"] = f\"cdr:1,cd_min:01/01/{filter_year},cd_max:12/31/{filter_year}\"\n\n        response = requests.get(base_url, params=params)\n\n        if response.status_code == 200:\n            results = response.json()\n        else:\n            raise ValueError(response.json())\n\n        if self.organic_key not in results.keys():\n            if filter_year is not None:\n                raise Exception(\n                    f\"No results found for query: '{query}' with filtering on year={filter_year}. Use a less restrictive query or do not filter on year.\"\n                )\n            else:\n                raise Exception(f\"No results found for query: '{query}'. Use a less restrictive query.\")\n        if len(results[self.organic_key]) == 0:\n            year_filter_message = f\" with filter year={filter_year}\" if filter_year is not None else \"\"\n            return f\"No results found for '{query}'{year_filter_message}. 
Try with a more general query, or remove the year filter.\"\n\n        web_snippets = []\n        if self.organic_key in results:\n            for idx, page in enumerate(results[self.organic_key]):\n                date_published = \"\"\n                if \"date\" in page:\n                    date_published = \"\\nDate published: \" + page[\"date\"]\n\n                source = \"\"\n                if \"source\" in page:\n                    source = \"\\nSource: \" + page[\"source\"]\n\n                snippet = \"\"\n                if \"snippet\" in page:\n                    snippet = \"\\n\" + page[\"snippet\"]\n\n                redacted_version = f\"{idx}. [{page['title']}]({page['link']}){date_published}{source}\\n{snippet}\"\n                web_snippets.append(redacted_version)\n\n        return \"## Search Results\\n\" + \"\\n\\n\".join(web_snippets)\n\n\nclass ApiWebSearchTool(Tool):\n    \"\"\"Web search tool that performs API-based searches.\n    By default, it uses the Brave Search API.\n\n    This tool implements a rate limiting mechanism to ensure compliance with API usage policies.\n    By default, it limits requests to 1 query per second.\n\n    Args:\n        endpoint (`str`): API endpoint URL. Defaults to Brave Search API.\n        api_key (`str`): API key for authentication.\n        api_key_name (`str`): Environment variable name containing the API key. Defaults to \"BRAVE_API_KEY\".\n        headers (`dict`, *optional*): Headers for API requests.\n        params (`dict`, *optional*): Parameters for API requests.\n        rate_limit (`float`, default `1.0`): Maximum queries per second. 
Set to `None` to disable rate limiting.\n\n    Examples:\n        ```python\n        >>> from smolagents import ApiWebSearchTool\n        >>> web_search_tool = ApiWebSearchTool(rate_limit=50.0)\n        >>> results = web_search_tool(\"Hugging Face\")\n        >>> print(results)\n        ```\n    \"\"\"\n\n    name = \"web_search\"\n    description = \"Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, URLs, and descriptions.\"\n    inputs = {\"query\": {\"type\": \"string\", \"description\": \"The search query to perform.\"}}\n    output_type = \"string\"\n\n    def __init__(\n        self,\n        endpoint: str = \"\",\n        api_key: str = \"\",\n        api_key_name: str = \"\",\n        headers: dict | None = None,\n        params: dict | None = None,\n        rate_limit: float | None = 1.0,\n    ):\n        import os\n\n        super().__init__()\n        self.endpoint = endpoint or \"https://api.search.brave.com/res/v1/web/search\"\n        self.api_key_name = api_key_name or \"BRAVE_API_KEY\"\n        self.api_key = api_key or os.getenv(self.api_key_name)\n        self.headers = headers or {\"X-Subscription-Token\": self.api_key}\n        self.params = params or {\"count\": 10}\n        self.rate_limit = rate_limit\n        self._min_interval = 1.0 / rate_limit if rate_limit else 0.0\n        self._last_request_time = 0.0\n\n    def _enforce_rate_limit(self) -> None:\n        import time\n\n        # No rate limit enforced\n        if not self.rate_limit:\n            return\n\n        now = time.time()\n        elapsed = now - self._last_request_time\n        if elapsed < self._min_interval:\n            time.sleep(self._min_interval - elapsed)\n        self._last_request_time = time.time()\n\n    def forward(self, query: str) -> str:\n        import requests\n\n        self._enforce_rate_limit()\n        params = {**self.params, \"q\": query}\n        response = requests.get(self.endpoint, 
headers=self.headers, params=params)\n        response.raise_for_status()\n        data = response.json()\n        results = self.extract_results(data)\n        return self.format_markdown(results)\n\n    def extract_results(self, data: dict) -> list:\n        results = []\n        for result in data.get(\"web\", {}).get(\"results\", []):\n            results.append(\n                {\"title\": result[\"title\"], \"url\": result[\"url\"], \"description\": result.get(\"description\", \"\")}\n            )\n        return results\n\n    def format_markdown(self, results: list) -> str:\n        if not results:\n            return \"No results found.\"\n        return \"## Search Results\\n\\n\" + \"\\n\\n\".join(\n            [\n                f\"{idx}. [{result['title']}]({result['url']})\\n{result['description']}\"\n                for idx, result in enumerate(results, start=1)\n            ]\n        )\n\n\nclass WebSearchTool(Tool):\n    name = \"web_search\"\n    description = \"Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, links, and descriptions.\"\n    inputs = {\"query\": {\"type\": \"string\", \"description\": \"The search query to perform.\"}}\n    output_type = \"string\"\n\n    def __init__(self, max_results: int = 10, engine: str = \"duckduckgo\"):\n        super().__init__()\n        self.max_results = max_results\n        self.engine = engine\n\n    def forward(self, query: str) -> str:\n        results = self.search(query)\n        if len(results) == 0:\n            raise Exception(\"No results found! 
Try a less restrictive/shorter query.\")\n        return self.parse_results(results)\n\n    def search(self, query: str) -> list:\n        if self.engine == \"duckduckgo\":\n            return self.search_duckduckgo(query)\n        elif self.engine == \"bing\":\n            return self.search_bing(query)\n        else:\n            raise ValueError(f\"Unsupported engine: {self.engine}\")\n\n    def parse_results(self, results: list) -> str:\n        return \"## Search Results\\n\\n\" + \"\\n\\n\".join(\n            [f\"[{result['title']}]({result['link']})\\n{result['description']}\" for result in results]\n        )\n\n    def search_duckduckgo(self, query: str) -> list:\n        import requests\n\n        response = requests.get(\n            \"https://lite.duckduckgo.com/lite/\",\n            params={\"q\": query},\n            headers={\"User-Agent\": \"Mozilla/5.0\"},\n        )\n        response.raise_for_status()\n        parser = self._create_duckduckgo_parser()\n        parser.feed(response.text)\n        return parser.results\n\n    def _create_duckduckgo_parser(self):\n        from html.parser import HTMLParser\n\n        class SimpleResultParser(HTMLParser):\n            def __init__(self):\n                super().__init__()\n                self.results = []\n                self.current = {}\n                self.capture_title = False\n                self.capture_description = False\n                self.capture_link = False\n\n            def handle_starttag(self, tag, attrs):\n                attrs = dict(attrs)\n                if tag == \"a\" and attrs.get(\"class\") == \"result-link\":\n                    self.capture_title = True\n                elif tag == \"td\" and attrs.get(\"class\") == \"result-snippet\":\n                    self.capture_description = True\n                elif tag == \"span\" and attrs.get(\"class\") == \"link-text\":\n                    self.capture_link = True\n\n            def handle_endtag(self, tag):\n         
       if tag == \"a\" and self.capture_title:\n                    self.capture_title = False\n                elif tag == \"td\" and self.capture_description:\n                    self.capture_description = False\n                elif tag == \"span\" and self.capture_link:\n                    self.capture_link = False\n                elif tag == \"tr\":\n                    # Store current result if all parts are present\n                    if {\"title\", \"description\", \"link\"} <= self.current.keys():\n                        self.current[\"description\"] = \" \".join(self.current[\"description\"])\n                        self.results.append(self.current)\n                        self.current = {}\n\n            def handle_data(self, data):\n                if self.capture_title:\n                    self.current[\"title\"] = data.strip()\n                elif self.capture_description:\n                    self.current.setdefault(\"description\", [])\n                    self.current[\"description\"].append(data.strip())\n                elif self.capture_link:\n                    self.current[\"link\"] = \"https://\" + data.strip()\n\n        return SimpleResultParser()\n\n    def search_bing(self, query: str) -> list:\n        import xml.etree.ElementTree as ET\n\n        import requests\n\n        response = requests.get(\n            \"https://www.bing.com/search\",\n            params={\"q\": query, \"format\": \"rss\"},\n        )\n        response.raise_for_status()\n        root = ET.fromstring(response.text)\n        items = root.findall(\".//item\")\n        results = [\n            {\n                \"title\": item.findtext(\"title\"),\n                \"link\": item.findtext(\"link\"),\n                \"description\": item.findtext(\"description\"),\n            }\n            for item in items[: self.max_results]\n        ]\n        return results\n\n\nclass VisitWebpageTool(Tool):\n    name = \"visit_webpage\"\n    description = (\n       
 \"Visits a webpage at the given url and reads its content as a markdown string. Use this to browse webpages.\"\n    )\n    inputs = {\n        \"url\": {\n            \"type\": \"string\",\n            \"description\": \"The url of the webpage to visit.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(self, max_output_length: int = 40000):\n        super().__init__()\n        self.max_output_length = max_output_length\n\n    def _truncate_content(self, content: str, max_length: int) -> str:\n        if len(content) <= max_length:\n            return content\n        return (\n            content[:max_length] + f\"\\n..._This content has been truncated to stay below {max_length} characters_...\\n\"\n        )\n\n    def forward(self, url: str) -> str:\n        try:\n            import re\n\n            import requests\n            from markdownify import markdownify\n            from requests.exceptions import RequestException\n        except ImportError as e:\n            raise ImportError(\n                \"You must install packages `markdownify` and `requests` to run this tool: for instance run `pip install markdownify requests`.\"\n            ) from e\n        try:\n            # Send a GET request to the URL with a 20-second timeout\n            response = requests.get(url, timeout=20)\n            response.raise_for_status()  # Raise an exception for bad status codes\n\n            # Convert the HTML content to Markdown\n            markdown_content = markdownify(response.text).strip()\n\n            # Remove multiple line breaks\n            markdown_content = re.sub(r\"\\n{3,}\", \"\\n\\n\", markdown_content)\n\n            return self._truncate_content(markdown_content, self.max_output_length)\n\n        except requests.exceptions.Timeout:\n            return \"The request timed out. 
Please try again later or check the URL.\"\n        except RequestException as e:\n            return f\"Error fetching the webpage: {str(e)}\"\n        except Exception as e:\n            return f\"An unexpected error occurred: {str(e)}\"\n\n\nclass WikipediaSearchTool(Tool):\n    \"\"\"\n    Search Wikipedia and return the summary or full text of the requested article, along with the page URL.\n\n    Attributes:\n        user_agent (`str`): Custom user-agent string to identify the project. This is required as per Wikipedia API policies.\n            See: https://foundation.wikimedia.org/wiki/Policy:Wikimedia_Foundation_User-Agent_Policy\n        language (`str`, default `\"en\"`): Language in which to retrieve Wikipedia article.\n            See: http://meta.wikimedia.org/wiki/List_of_Wikipedias\n        content_type (`Literal[\"summary\", \"text\"]`, default `\"text\"`): Type of content to fetch. Can be \"summary\" for a short summary or \"text\" for the full article.\n        extract_format (`Literal[\"HTML\", \"WIKI\"]`, default `\"WIKI\"`): Extraction format of the output. 
Can be `\"WIKI\"` or `\"HTML\"`.\n\n    Example:\n        ```python\n        >>> from smolagents import CodeAgent, InferenceClientModel, WikipediaSearchTool\n        >>> agent = CodeAgent(\n        >>>     tools=[\n        >>>            WikipediaSearchTool(\n        >>>                user_agent=\"MyResearchBot (myemail@example.com)\",\n        >>>                language=\"en\",\n        >>>                content_type=\"summary\",  # or \"text\"\n        >>>                extract_format=\"WIKI\",\n        >>>            )\n        >>>        ],\n        >>>     model=InferenceClientModel(),\n        >>> )\n        >>> agent.run(\"Python_(programming_language)\")\n        ```\n    \"\"\"\n\n    name = \"wikipedia_search\"\n    description = \"Searches Wikipedia and returns a summary or full text of the given topic, along with the page URL.\"\n    inputs = {\n        \"query\": {\n            \"type\": \"string\",\n            \"description\": \"The topic to search on Wikipedia.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __init__(\n        self,\n        user_agent: str = \"Smolagents (myemail@example.com)\",\n        language: str = \"en\",\n        content_type: str = \"text\",\n        extract_format: str = \"WIKI\",\n    ):\n        super().__init__()\n        try:\n            import wikipediaapi\n        except ImportError as e:\n            raise ImportError(\n                \"You must install `wikipedia-api` to run this tool: for instance run `pip install wikipedia-api`\"\n            ) from e\n        if not user_agent:\n            raise ValueError(\"User-agent is required. 
Provide a meaningful identifier for your project.\")\n\n        self.user_agent = user_agent\n        self.language = language\n        self.content_type = content_type\n\n        # Map string format to wikipediaapi.ExtractFormat\n        extract_format_map = {\n            \"WIKI\": wikipediaapi.ExtractFormat.WIKI,\n            \"HTML\": wikipediaapi.ExtractFormat.HTML,\n        }\n\n        if extract_format not in extract_format_map:\n            raise ValueError(\"Invalid extract_format. Choose between 'WIKI' or 'HTML'.\")\n\n        self.extract_format = extract_format_map[extract_format]\n\n        self.wiki = wikipediaapi.Wikipedia(\n            user_agent=self.user_agent, language=self.language, extract_format=self.extract_format\n        )\n\n    def forward(self, query: str) -> str:\n        try:\n            page = self.wiki.page(query)\n\n            if not page.exists():\n                return f\"No Wikipedia page found for '{query}'. Try a different query.\"\n\n            title = page.title\n            url = page.fullurl\n\n            if self.content_type == \"summary\":\n                text = page.summary\n            elif self.content_type == \"text\":\n                text = page.text\n            else:\n                return \"⚠️ Invalid `content_type`. Use either 'summary' or 'text'.\"\n\n            return f\"✅ **Wikipedia Page:** {title}\\n\\n**Content:** {text}\\n\\n🔗 **Read more:** {url}\"\n\n        except Exception as e:\n            return f\"Error fetching Wikipedia summary: {str(e)}\"\n\n\nclass SpeechToTextTool(PipelineTool):\n    default_checkpoint = \"openai/whisper-large-v3-turbo\"\n    description = \"This is a tool that transcribes an audio into text. It returns the transcribed text.\"\n    name = \"transcriber\"\n    inputs = {\n        \"audio\": {\n            \"type\": \"audio\",\n            \"description\": \"The audio to transcribe. 
Can be a local path, a URL, or a tensor.\",\n        }\n    }\n    output_type = \"string\"\n\n    def __new__(cls, *args, **kwargs):\n        from transformers.models.whisper import WhisperForConditionalGeneration, WhisperProcessor\n\n        cls.pre_processor_class = WhisperProcessor\n        cls.model_class = WhisperForConditionalGeneration\n        return super().__new__(cls)\n\n    def encode(self, audio):\n        from .agent_types import AgentAudio\n\n        audio = AgentAudio(audio).to_raw()\n        return self.pre_processor(audio, return_tensors=\"pt\")\n\n    def forward(self, inputs):\n        return self.model.generate(inputs[\"input_features\"])\n\n    def decode(self, outputs):\n        return self.pre_processor.batch_decode(outputs, skip_special_tokens=True)[0]\n\n\nTOOL_MAPPING = {\n    tool_class.name: tool_class\n    for tool_class in [\n        PythonInterpreterTool,\n        DuckDuckGoSearchTool,\n        VisitWebpageTool,\n    ]\n}\n\n__all__ = [\n    \"ApiWebSearchTool\",\n    \"PythonInterpreterTool\",\n    \"FinalAnswerTool\",\n    \"UserInputTool\",\n    \"WebSearchTool\",\n    \"DuckDuckGoSearchTool\",\n    \"GoogleSearchTool\",\n    \"VisitWebpageTool\",\n    \"WikipediaSearchTool\",\n    \"SpeechToTextTool\",\n]\n"
  },
  {
    "path": "src/smolagents/gradio_ui.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport os\nimport re\nimport shutil\nfrom pathlib import Path\nfrom typing import Generator\n\nfrom smolagents.agent_types import AgentAudio, AgentImage, AgentText\nfrom smolagents.agents import MultiStepAgent, PlanningStep\nfrom smolagents.memory import ActionStep, FinalAnswerStep\nfrom smolagents.models import ChatMessageStreamDelta, MessageRole, agglomerate_stream_deltas\nfrom smolagents.utils import _is_package_available\n\n\ndef get_step_footnote_content(step_log: ActionStep | PlanningStep, step_name: str) -> str:\n    \"\"\"Get a footnote string for a step log with duration and token information\"\"\"\n    step_footnote = f\"**{step_name}**\"\n    if step_log.token_usage is not None:\n        step_footnote += f\" | Input tokens: {step_log.token_usage.input_tokens:,} | Output tokens: {step_log.token_usage.output_tokens:,}\"\n    step_footnote += f\" | Duration: {round(float(step_log.timing.duration), 2)}s\" if step_log.timing.duration else \"\"\n    step_footnote_content = f\"\"\"<span style=\"color: #bbbbc2; font-size: 12px;\">{step_footnote}</span> \"\"\"\n    return step_footnote_content\n\n\ndef _clean_model_output(model_output: str) -> str:\n    \"\"\"\n    Clean up model output by removing trailing tags and extra backticks.\n\n    Args:\n        model_output (`str`): Raw model 
output.\n\n    Returns:\n        `str`: Cleaned model output.\n    \"\"\"\n    if not model_output:\n        return \"\"\n    model_output = model_output.strip()\n    # Remove any trailing <end_code> and extra backticks, handling multiple possible formats\n    model_output = re.sub(r\"```\\s*<end_code>\", \"```\", model_output)  # handles ```<end_code>\n    model_output = re.sub(r\"<end_code>\\s*```\", \"```\", model_output)  # handles <end_code>```\n    model_output = re.sub(r\"```\\s*\\n\\s*<end_code>\", \"```\", model_output)  # handles ```\\n<end_code>\n    return model_output.strip()\n\n\ndef _format_code_content(content: str) -> str:\n    \"\"\"\n    Format code content as Python code block if it's not already formatted.\n\n    Args:\n        content (`str`): Code content to format.\n\n    Returns:\n        `str`: Code content formatted as a Python code block.\n    \"\"\"\n    content = content.strip()\n    # Remove existing code blocks and end_code tags\n    content = re.sub(r\"```.*?\\n\", \"\", content)\n    content = re.sub(r\"\\s*<end_code>\\s*\", \"\", content)\n    content = content.strip()\n    # Add Python code block formatting if not already present\n    if not content.startswith(\"```python\"):\n        content = f\"```python\\n{content}\\n```\"\n    return content\n\n\ndef _process_action_step(step_log: ActionStep, skip_model_outputs: bool = False) -> Generator:\n    \"\"\"\n    Process an [`ActionStep`] and yield appropriate Gradio ChatMessage objects.\n\n    Args:\n        step_log ([`ActionStep`]): ActionStep to process.\n        skip_model_outputs (`bool`): Whether to skip model outputs.\n\n    Yields:\n        `gradio.ChatMessage`: Gradio ChatMessages representing the action step.\n    \"\"\"\n    import gradio as gr\n\n    # Output the step number\n    step_number = f\"Step {step_log.step_number}\"\n    if not skip_model_outputs:\n        yield gr.ChatMessage(role=MessageRole.ASSISTANT, content=f\"**{step_number}**\", metadata={\"status\": 
\"done\"})\n\n    # First yield the thought/reasoning from the LLM\n    if not skip_model_outputs and getattr(step_log, \"model_output\", \"\"):\n        model_output = _clean_model_output(step_log.model_output)\n        yield gr.ChatMessage(role=MessageRole.ASSISTANT, content=model_output, metadata={\"status\": \"done\"})\n\n    # For tool calls, create a parent message\n    if getattr(step_log, \"tool_calls\", []):\n        first_tool_call = step_log.tool_calls[0]\n        used_code = first_tool_call.name == \"python_interpreter\"\n\n        # Process arguments based on type\n        args = first_tool_call.arguments\n        if isinstance(args, dict):\n            content = str(args.get(\"answer\", str(args)))\n        else:\n            content = str(args).strip()\n\n        # Format code content if needed\n        if used_code:\n            content = _format_code_content(content)\n\n        # Create the tool call message\n        parent_message_tool = gr.ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=content,\n            metadata={\n                \"title\": f\"🛠️ Used tool {first_tool_call.name}\",\n                \"status\": \"done\",\n            },\n        )\n        yield parent_message_tool\n\n    # Display execution logs if they exist\n    if getattr(step_log, \"observations\", \"\") and step_log.observations.strip():\n        log_content = step_log.observations.strip()\n        if log_content:\n            log_content = re.sub(r\"^Execution logs:\\s*\", \"\", log_content)\n            yield gr.ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=f\"```bash\\n{log_content}\\n\",\n                metadata={\"title\": \"📝 Execution Logs\", \"status\": \"done\"},\n            )\n\n    # Display any images in observations\n    if getattr(step_log, \"observations_images\", []):\n        for image in step_log.observations_images:\n            path_image = AgentImage(image).to_string()\n            
yield gr.ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content={\"path\": path_image, \"mime_type\": f\"image/{path_image.split('.')[-1]}\"},\n                metadata={\"title\": \"🖼️ Output Image\", \"status\": \"done\"},\n            )\n\n    # Handle errors\n    if getattr(step_log, \"error\", None):\n        yield gr.ChatMessage(\n            role=MessageRole.ASSISTANT, content=str(step_log.error), metadata={\"title\": \"💥 Error\", \"status\": \"done\"}\n        )\n\n    # Add step footnote and separator\n    yield gr.ChatMessage(\n        role=MessageRole.ASSISTANT,\n        content=get_step_footnote_content(step_log, step_number),\n        metadata={\"status\": \"done\"},\n    )\n    yield gr.ChatMessage(role=MessageRole.ASSISTANT, content=\"-----\", metadata={\"status\": \"done\"})\n\n\ndef _process_planning_step(step_log: PlanningStep, skip_model_outputs: bool = False) -> Generator:\n    \"\"\"\n    Process a [`PlanningStep`] and yield appropriate gradio.ChatMessage objects.\n\n    Args:\n        step_log ([`PlanningStep`]): PlanningStep to process.\n\n    Yields:\n        `gradio.ChatMessage`: Gradio ChatMessages representing the planning step.\n    \"\"\"\n    import gradio as gr\n\n    if not skip_model_outputs:\n        yield gr.ChatMessage(role=MessageRole.ASSISTANT, content=\"**Planning step**\", metadata={\"status\": \"done\"})\n        yield gr.ChatMessage(role=MessageRole.ASSISTANT, content=step_log.plan, metadata={\"status\": \"done\"})\n    yield gr.ChatMessage(\n        role=MessageRole.ASSISTANT,\n        content=get_step_footnote_content(step_log, \"Planning step\"),\n        metadata={\"status\": \"done\"},\n    )\n    yield gr.ChatMessage(role=MessageRole.ASSISTANT, content=\"-----\", metadata={\"status\": \"done\"})\n\n\ndef _process_final_answer_step(step_log: FinalAnswerStep) -> Generator:\n    \"\"\"\n    Process a [`FinalAnswerStep`] and yield appropriate gradio.ChatMessage objects.\n\n    Args:\n        
step_log ([`FinalAnswerStep`]): FinalAnswerStep to process.\n\n    Yields:\n        `gradio.ChatMessage`: Gradio ChatMessages representing the final answer.\n    \"\"\"\n    import gradio as gr\n\n    final_answer = step_log.output\n    if isinstance(final_answer, AgentText):\n        yield gr.ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=f\"**Final answer:**\\n{final_answer.to_string()}\\n\",\n            metadata={\"status\": \"done\"},\n        )\n    elif isinstance(final_answer, AgentImage):\n        yield gr.ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content={\"path\": final_answer.to_string(), \"mime_type\": \"image/png\"},\n            metadata={\"status\": \"done\"},\n        )\n    elif isinstance(final_answer, AgentAudio):\n        yield gr.ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content={\"path\": final_answer.to_string(), \"mime_type\": \"audio/wav\"},\n            metadata={\"status\": \"done\"},\n        )\n    else:\n        yield gr.ChatMessage(\n            role=MessageRole.ASSISTANT, content=f\"**Final answer:** {str(final_answer)}\", metadata={\"status\": \"done\"}\n        )\n\n\ndef pull_messages_from_step(step_log: ActionStep | PlanningStep | FinalAnswerStep, skip_model_outputs: bool = False):\n    \"\"\"Extract Gradio ChatMessage objects from agent steps with proper nesting.\n\n    Args:\n        step_log: The step log to display as gr.ChatMessage objects.\n        skip_model_outputs: If True, skip the model outputs when creating the gr.ChatMessage objects:\n            This is used for instance when streaming model outputs have already been displayed.\n    \"\"\"\n    if not _is_package_available(\"gradio\"):\n        raise ModuleNotFoundError(\n            \"Please install 'gradio' extra to use the GradioUI: `pip install 'smolagents[gradio]'`\"\n        )\n    if isinstance(step_log, ActionStep):\n        yield from _process_action_step(step_log, 
skip_model_outputs)\n    elif isinstance(step_log, PlanningStep):\n        yield from _process_planning_step(step_log, skip_model_outputs)\n    elif isinstance(step_log, FinalAnswerStep):\n        yield from _process_final_answer_step(step_log)\n    else:\n        raise ValueError(f\"Unsupported step type: {type(step_log)}\")\n\n\ndef stream_to_gradio(\n    agent,\n    task: str,\n    task_images: list | None = None,\n    reset_agent_memory: bool = False,\n    additional_args: dict | None = None,\n) -> Generator:\n    \"\"\"Runs an agent with the given task and streams the messages from the agent as gradio ChatMessages.\"\"\"\n\n    if not _is_package_available(\"gradio\"):\n        raise ModuleNotFoundError(\n            \"Please install 'gradio' extra to use the GradioUI: `pip install 'smolagents[gradio]'`\"\n        )\n    accumulated_events: list[ChatMessageStreamDelta] = []\n    for event in agent.run(\n        task, images=task_images, stream=True, reset=reset_agent_memory, additional_args=additional_args\n    ):\n        if isinstance(event, ActionStep | PlanningStep | FinalAnswerStep):\n            for message in pull_messages_from_step(\n                event,\n                # If we're streaming model outputs, no need to display them twice\n                skip_model_outputs=getattr(agent, \"stream_outputs\", False),\n            ):\n                yield message\n            accumulated_events = []\n        elif isinstance(event, ChatMessageStreamDelta):\n            accumulated_events.append(event)\n            text = agglomerate_stream_deltas(accumulated_events).render_as_markdown()\n            yield text\n\n\nclass GradioUI:\n    \"\"\"\n    Gradio interface for interacting with a [`MultiStepAgent`].\n\n    This class provides a web interface to interact with the agent in real-time, allowing users to submit prompts, upload files, and receive responses in a chat-like format.\n    It uses the modern [`gradio.ChatInterface`] component for a native 
chatbot experience.\n    It can reset the agent's memory at the start of each interaction if desired.\n    It supports file uploads via multimodal input.\n    This class requires the `gradio` extra to be installed: `pip install 'smolagents[gradio]'`.\n\n    Args:\n        agent ([`MultiStepAgent`]): The agent to interact with.\n        file_upload_folder (`str`, *optional*): The folder where uploaded files will be saved.\n            If not provided, file uploads are disabled.\n        reset_agent_memory (`bool`, *optional*, defaults to `False`): Whether to reset the agent's memory at the start of each interaction.\n            If `True`, the agent will not remember previous interactions.\n\n    Raises:\n        ModuleNotFoundError: If the `gradio` extra is not installed.\n\n    Example:\n        ```python\n        from smolagents import CodeAgent, GradioUI, InferenceClientModel\n\n        model = InferenceClientModel(model_id=\"meta-llama/Meta-Llama-3.1-8B-Instruct\")\n        agent = CodeAgent(tools=[], model=model)\n        gradio_ui = GradioUI(agent, file_upload_folder=\"uploads\", reset_agent_memory=True)\n        gradio_ui.launch()\n        ```\n    \"\"\"\n\n    def __init__(self, agent: MultiStepAgent, file_upload_folder: str | None = None, reset_agent_memory: bool = False):\n        if not _is_package_available(\"gradio\"):\n            raise ModuleNotFoundError(\n                \"Please install 'gradio' extra to use the GradioUI: `pip install 'smolagents[gradio]'`\"\n            )\n        self.agent = agent\n        self.file_upload_folder = Path(file_upload_folder) if file_upload_folder is not None else None\n        self.reset_agent_memory = reset_agent_memory\n        self.name = getattr(agent, \"name\") or \"Agent interface\"\n        self.description = getattr(agent, \"description\", None)\n        if self.file_upload_folder is not None:\n            if not self.file_upload_folder.exists():\n                
self.file_upload_folder.mkdir(parents=True, exist_ok=True)\n\n    def _save_uploaded_file(self, file_path: str) -> str:\n        \"\"\"Save an uploaded file to the upload folder and return the new path.\"\"\"\n        if self.file_upload_folder is None:\n            return file_path\n\n        original_name = os.path.basename(file_path)\n        sanitized_name = re.sub(r\"[^\\w\\-.]\", \"_\", original_name)\n        dest_path = os.path.join(self.file_upload_folder, sanitized_name)\n        shutil.copy(file_path, dest_path)\n        return dest_path\n\n    def upload_file(self, file, file_uploads_log: list, allowed_file_types: list | None = None):\n        \"\"\"\n        Handle file upload with validation.\n\n        Args:\n            file: The uploaded file object.\n            file_uploads_log: List to track uploaded files.\n            allowed_file_types: List of allowed extensions. Defaults to [\".pdf\", \".docx\", \".txt\"].\n\n        Returns:\n            Tuple of (status textbox, updated file log).\n        \"\"\"\n        import gradio as gr\n\n        if file is None:\n            return gr.Textbox(value=\"No file uploaded\", visible=True), file_uploads_log\n\n        if allowed_file_types is None:\n            allowed_file_types = [\".pdf\", \".docx\", \".txt\"]\n\n        file_ext = os.path.splitext(file.name)[1].lower()\n        if file_ext not in allowed_file_types:\n            return gr.Textbox(value=\"File type disallowed\", visible=True), file_uploads_log\n\n        file_path = self._save_uploaded_file(file.name)\n        return gr.Textbox(value=f\"File uploaded: {file_path}\", visible=True), file_uploads_log + [file_path]\n\n    def _process_message(self, message: str | dict) -> tuple[str, list[str] | None]:\n        \"\"\"Process incoming message and extract text and files.\"\"\"\n        if isinstance(message, str):\n            return message, None\n\n        text = message.get(\"text\", \"\")\n        files = message.get(\"files\", [])\n\n   
     if files and self.file_upload_folder:\n            saved_files = [self._save_uploaded_file(f) for f in files]\n            if saved_files:\n                text += f\"\\nYou have been provided with these files: {saved_files}\"\n            return text, saved_files\n\n        return text, files if files else None\n\n    def _stream_response(self, message: str | dict, history: list[dict]) -> Generator:  # noqa: ARG002\n        \"\"\"Stream agent responses for ChatInterface.\"\"\"\n        import gradio as gr\n\n        task, task_files = self._process_message(message)\n\n        all_messages: list[gr.ChatMessage] = []\n        accumulated_events: list[ChatMessageStreamDelta] = []\n        streaming_msg_idx: int | None = None\n\n        for event in self.agent.run(\n            task, images=task_files, stream=True, reset=self.reset_agent_memory, additional_args=None\n        ):\n            if isinstance(event, ActionStep | PlanningStep | FinalAnswerStep):\n                # Remove streaming message if present\n                if streaming_msg_idx is not None:\n                    all_messages.pop(streaming_msg_idx)\n                    streaming_msg_idx = None\n\n                for msg in pull_messages_from_step(\n                    event,\n                    skip_model_outputs=getattr(self.agent, \"stream_outputs\", False),\n                ):\n                    all_messages.append(\n                        gr.ChatMessage(\n                            role=msg.role,\n                            content=msg.content,\n                            metadata=msg.metadata,\n                        )\n                    )\n                    yield all_messages\n                accumulated_events = []\n            elif isinstance(event, ChatMessageStreamDelta):\n                accumulated_events.append(event)\n                text = agglomerate_stream_deltas(accumulated_events).render_as_markdown()\n                text = text.replace(\"<\", 
r\"\\<\").replace(\">\", r\"\\>\")\n                msg = gr.ChatMessage(role=\"assistant\", content=text)\n                if streaming_msg_idx is None:\n                    streaming_msg_idx = len(all_messages)\n                    all_messages.append(msg)\n                else:\n                    all_messages[streaming_msg_idx] = msg\n                yield all_messages\n\n    def launch(self, share: bool = True, **kwargs):\n        \"\"\"\n        Launch the Gradio app with the agent interface.\n\n        Args:\n            share (`bool`, defaults to `True`): Whether to share the app publicly.\n            **kwargs: Additional keyword arguments to pass to the Gradio launch method.\n        \"\"\"\n        self.create_app().launch(debug=True, share=share, **kwargs)\n\n    def create_app(self):\n        import gradio as gr\n\n        # Gradio 5.x requires type=\"messages\", but Gradio 6 removed this parameter\n        type_messages_kwarg = {\"type\": \"messages\"} if gr.__version__.startswith(\"5\") else {}\n\n        chatbot = gr.Chatbot(\n            label=\"Agent\",\n            avatar_images=(\n                None,\n                \"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/mascot_smol.png\",\n            ),\n            latex_delimiters=[\n                {\"left\": r\"$$\", \"right\": r\"$$\", \"display\": True},\n                {\"left\": r\"$\", \"right\": r\"$\", \"display\": False},\n                {\"left\": r\"\\[\", \"right\": r\"\\]\", \"display\": True},\n                {\"left\": r\"\\(\", \"right\": r\"\\)\", \"display\": False},\n            ],\n            **type_messages_kwarg,\n        )\n\n        demo = gr.ChatInterface(\n            fn=self._stream_response,\n            chatbot=chatbot,\n            title=self.name.replace(\"_\", \" \").capitalize(),\n            multimodal=self.file_upload_folder is not None,\n            save_history=True,\n            **type_messages_kwarg,\n        
)\n        return demo\n\n\n__all__ = [\"stream_to_gradio\", \"GradioUI\"]\n"
  },
  {
    "path": "src/smolagents/local_python_executor.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport ast\nimport builtins\nimport difflib\nimport inspect\nimport logging\nimport math\nimport re\nfrom abc import ABC, abstractmethod\nfrom collections.abc import Callable, Generator, Mapping\nfrom concurrent.futures import ThreadPoolExecutor\nfrom concurrent.futures import TimeoutError as FuturesTimeoutError\nfrom dataclasses import dataclass\nfrom functools import wraps\nfrom importlib import import_module\nfrom importlib.util import find_spec\nfrom types import BuiltinFunctionType, FunctionType, ModuleType\nfrom typing import Any\n\nfrom .tools import Tool\nfrom .utils import BASE_BUILTIN_MODULES, truncate_content\n\n\nlogger = logging.getLogger(__name__)\n\n\nclass InterpreterError(ValueError):\n    \"\"\"\n    An error raised when the interpreter cannot evaluate a Python expression, due to syntax error or unsupported\n    operations.\n    \"\"\"\n\n    pass\n\n\nERRORS = {\n    name: getattr(builtins, name)\n    for name in dir(builtins)\n    if isinstance(getattr(builtins, name), type) and issubclass(getattr(builtins, name), BaseException)\n}\n\nDEFAULT_MAX_LEN_OUTPUT = 50000\nMAX_OPERATIONS = 10000000\nMAX_WHILE_ITERATIONS = 1000000\nMAX_EXECUTION_TIME_SECONDS = 30\nALLOWED_DUNDER_METHODS = [\"__init__\", \"__str__\", \"__repr__\"]\n\n\ndef custom_print(*args):\n    return 
None\n\n\ndef nodunder_getattr(obj, name, default=None):\n    if name.startswith(\"__\") and name.endswith(\"__\"):\n        raise InterpreterError(f\"Forbidden access to dunder attribute: {name}\")\n    return getattr(obj, name, default)\n\n\nBASE_PYTHON_TOOLS = {\n    \"print\": custom_print,\n    \"isinstance\": isinstance,\n    \"range\": range,\n    \"float\": float,\n    \"int\": int,\n    \"bool\": bool,\n    \"str\": str,\n    \"set\": set,\n    \"list\": list,\n    \"dict\": dict,\n    \"tuple\": tuple,\n    \"round\": round,\n    \"ceil\": math.ceil,\n    \"floor\": math.floor,\n    \"log\": math.log,\n    \"exp\": math.exp,\n    \"sin\": math.sin,\n    \"cos\": math.cos,\n    \"tan\": math.tan,\n    \"asin\": math.asin,\n    \"acos\": math.acos,\n    \"atan\": math.atan,\n    \"atan2\": math.atan2,\n    \"degrees\": math.degrees,\n    \"radians\": math.radians,\n    \"pow\": pow,\n    \"sqrt\": math.sqrt,\n    \"len\": len,\n    \"sum\": sum,\n    \"max\": max,\n    \"min\": min,\n    \"abs\": abs,\n    \"enumerate\": enumerate,\n    \"zip\": zip,\n    \"reversed\": reversed,\n    \"sorted\": sorted,\n    \"all\": all,\n    \"any\": any,\n    \"map\": map,\n    \"filter\": filter,\n    \"ord\": ord,\n    \"chr\": chr,\n    \"next\": next,\n    \"iter\": iter,\n    \"divmod\": divmod,\n    \"callable\": callable,\n    \"getattr\": nodunder_getattr,\n    \"hasattr\": hasattr,\n    \"setattr\": setattr,\n    \"issubclass\": issubclass,\n    \"type\": type,\n    \"complex\": complex,\n}\n\n# Non-exhaustive list of dangerous modules that should not be imported\nDANGEROUS_MODULES = [\n    \"builtins\",\n    \"io\",\n    \"multiprocessing\",\n    \"os\",\n    \"pathlib\",\n    \"pty\",\n    \"shutil\",\n    \"socket\",\n    \"subprocess\",\n    \"sys\",\n]\n\nDANGEROUS_FUNCTIONS = [\n    \"builtins.compile\",\n    \"builtins.eval\",\n    \"builtins.exec\",\n    \"builtins.globals\",\n    \"builtins.locals\",\n    \"builtins.__import__\",\n    \"os.popen\",\n    
\"os.system\",\n    \"posix.system\",\n]\n\n\ndef check_safer_result(result: Any, static_tools: dict[str, Callable] = None, authorized_imports: list[str] = None):\n    \"\"\"\n    Checks if a result is safer according to authorized imports and static tools.\n\n    Args:\n        result (Any): The result to check.\n        static_tools (dict[str, Callable]): Dictionary of static tools.\n        authorized_imports (list[str]): List of authorized imports.\n\n    Raises:\n        InterpreterError: If the result is not safe\n    \"\"\"\n    if isinstance(result, ModuleType):\n        if not check_import_authorized(result.__name__, authorized_imports):\n            raise InterpreterError(f\"Forbidden access to module: {result.__name__}\")\n    elif isinstance(result, dict) and result.get(\"__spec__\"):\n        if not check_import_authorized(result[\"__name__\"], authorized_imports):\n            raise InterpreterError(f\"Forbidden access to module: {result['__name__']}\")\n    elif isinstance(result, (FunctionType, BuiltinFunctionType)):\n        for qualified_function_name in DANGEROUS_FUNCTIONS:\n            module_name, function_name = qualified_function_name.rsplit(\".\", 1)\n            if (\n                (static_tools is None or function_name not in static_tools)\n                and result.__name__ == function_name\n                and result.__module__ == module_name\n            ):\n                raise InterpreterError(f\"Forbidden access to function: {function_name}\")\n\n\ndef safer_eval(func: Callable):\n    \"\"\"\n    Decorator to enhance the security of an evaluation function by checking its return value.\n\n    Args:\n        func (Callable): Evaluation function to be made safer.\n\n    Returns:\n        Callable: Safer evaluation function with return value check.\n    \"\"\"\n\n    @wraps(func)\n    def _check_return(\n        expression,\n        state,\n        static_tools,\n        custom_tools,\n        
authorized_imports=BASE_BUILTIN_MODULES,\n    ):\n        result = func(expression, state, static_tools, custom_tools, authorized_imports=authorized_imports)\n        check_safer_result(result, static_tools, authorized_imports)\n        return result\n\n    return _check_return\n\n\ndef safer_func(\n    func: Callable,\n    static_tools: dict[str, Callable] = BASE_PYTHON_TOOLS,\n    authorized_imports: list[str] = BASE_BUILTIN_MODULES,\n):\n    \"\"\"\n    Decorator to enhance the security of a function call by checking its return value.\n\n    Args:\n        func (Callable): Function to be made safer.\n        static_tools (dict[str, Callable]): Dictionary of static tools.\n        authorized_imports (list[str]): List of authorized imports.\n\n    Returns:\n        Callable: Safer function with return value check.\n    \"\"\"\n    # If the function is a type, return it directly without wrapping\n    if isinstance(func, type):\n        return func\n\n    @wraps(func)\n    def _check_return(*args, **kwargs):\n        result = func(*args, **kwargs)\n        check_safer_result(result, static_tools, authorized_imports)\n        return result\n\n    return _check_return\n\n\nclass PrintContainer:\n    def __init__(self):\n        self.value = \"\"\n\n    def append(self, text):\n        self.value += text\n        return self\n\n    def __iadd__(self, other):\n        \"\"\"Implements the += operator\"\"\"\n        self.value += str(other)\n        return self\n\n    def __str__(self):\n        \"\"\"String representation\"\"\"\n        return self.value\n\n    def __repr__(self):\n        \"\"\"Representation for debugging\"\"\"\n        return f\"PrintContainer({self.value})\"\n\n    def __len__(self):\n        \"\"\"Implements len() function support\"\"\"\n        return len(self.value)\n\n\nclass BreakException(Exception):\n    pass\n\n\nclass ContinueException(Exception):\n    pass\n\n\nclass ReturnException(Exception):\n    def __init__(self, value):\n        
self.value = value\n\n\nclass ExecutionTimeoutError(Exception):\n    \"\"\"Exception raised when code execution exceeds the maximum allowed time.\"\"\"\n\n    pass\n\n\ndef timeout(timeout_seconds: int):\n    \"\"\"\n    Decorator to limit the execution time of a function using threading.\n\n    This implementation is cross-platform (works on Windows) and thread-safe (works when\n    called from any thread, not just the main thread), unlike signal-based approaches.\n\n    Args:\n        timeout_seconds (`int`): Maximum time in seconds allowed for function execution.\n\n    Raises:\n        ExecutionTimeoutError: If the function execution exceeds the timeout period.\n\n    Note:\n        If a timeout occurs, the thread running the function cannot be forcefully killed\n        in Python, so it will continue running in the background until completion. However,\n        the caller will receive an ExecutionTimeoutError and can continue execution.\n    \"\"\"\n\n    def decorator(func):\n        @wraps(func)\n        def wrapper(*args, **kwargs):\n            # Create a new ThreadPoolExecutor for each call to avoid threading issues\n            with ThreadPoolExecutor(max_workers=1) as executor:\n                future = executor.submit(func, *args, **kwargs)\n                try:\n                    result = future.result(timeout=timeout_seconds)\n                    return result\n                except FuturesTimeoutError:\n                    raise ExecutionTimeoutError(\n                        f\"Code execution exceeded the maximum execution time of {timeout_seconds} seconds\"\n                    )\n\n        return wrapper\n\n    return decorator\n\n\ndef get_iterable(obj):\n    if isinstance(obj, list):\n        return obj\n    elif hasattr(obj, \"__iter__\"):\n        return list(obj)\n    else:\n        raise InterpreterError(\"Object is not iterable\")\n\n\ndef fix_final_answer_code(code: str) -> str:\n    \"\"\"\n    Sometimes an LLM can try to assign a variable to 
final_answer, which would break the final_answer() tool.\n    This function fixes this behaviour by replacing variable assignments to final_answer with final_answer_variable,\n    while preserving function calls to final_answer().\n    \"\"\"\n    # First, find if there's a direct assignment to final_answer\n    # Use word boundary and negative lookbehind to ensure it's not an object attribute\n    assignment_pattern = r\"(?<!\\.)(?<!\\w)\\bfinal_answer\\s*=\"\n    if \"final_answer(\" not in code or not re.search(assignment_pattern, code):\n        # If the final_answer tool is not called in this blob, doing the replacement is hazardous because it could corrupt the model's memory for the next steps.\n        # Let's not modify the code and let the subsequent assignment error happen.\n        return code\n\n    # Pattern for replacing variable assignments\n    # Looks for 'final_answer' followed by '=' with optional whitespace\n    # Negative lookbehind ensures we don't match object attributes\n    assignment_regex = r\"(?<!\\.)(?<!\\w)(\\bfinal_answer)(\\s*=)\"\n    code = re.sub(assignment_regex, r\"final_answer_variable\\2\", code)\n\n    # Pattern for replacing variable usage but not function calls\n    # Negative lookahead (?!\\s*\\() ensures we don't match function calls\n    # Negative lookbehind (?<!\\.|\\w) ensures we don't match object methods or other variables\n    variable_regex = r\"(?<!\\.)(?<!\\w)(\\bfinal_answer\\b)(?!\\s*\\()\"\n    code = re.sub(variable_regex, \"final_answer_variable\", code)\n    return code\n\n\ndef build_import_tree(authorized_imports: list[str]) -> dict[str, Any]:\n    tree = {}\n    for import_path in authorized_imports:\n        parts = import_path.split(\".\")\n        current = tree\n        for part in parts:\n            if part not in current:\n                current[part] = {}\n            current = current[part]\n    return tree\n\n\ndef check_import_authorized(import_to_check: str, authorized_imports: list[str]) -> 
bool:\n    current_node = build_import_tree(authorized_imports)\n    for part in import_to_check.split(\".\"):\n        if \"*\" in current_node:\n            return True\n        if part not in current_node:\n            return False\n        current_node = current_node[part]\n    return True\n\n\ndef evaluate_attribute(\n    expression: ast.Attribute,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    if expression.attr.startswith(\"__\") and expression.attr.endswith(\"__\"):\n        raise InterpreterError(f\"Forbidden access to dunder attribute: {expression.attr}\")\n    value = evaluate_ast(expression.value, state, static_tools, custom_tools, authorized_imports)\n    return getattr(value, expression.attr)\n\n\ndef evaluate_unaryop(\n    expression: ast.UnaryOp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    operand = evaluate_ast(expression.operand, state, static_tools, custom_tools, authorized_imports)\n    if isinstance(expression.op, ast.USub):\n        return -operand\n    elif isinstance(expression.op, ast.UAdd):\n        return operand\n    elif isinstance(expression.op, ast.Not):\n        return not operand\n    elif isinstance(expression.op, ast.Invert):\n        return ~operand\n    else:\n        raise InterpreterError(f\"Unary operation {expression.op.__class__.__name__} is not supported.\")\n\n\ndef evaluate_lambda(\n    lambda_expression: ast.Lambda,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Callable:\n    args = [arg.arg for arg in lambda_expression.args.args]\n\n    def lambda_func(*values: Any) -> Any:\n        new_state = state.copy()\n        for arg, value in zip(args, values):\n            new_state[arg] = value\n    
    return evaluate_ast(\n            lambda_expression.body,\n            new_state,\n            static_tools,\n            custom_tools,\n            authorized_imports,\n        )\n\n    return lambda_func\n\n\ndef evaluate_while(\n    while_loop: ast.While,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    iterations = 0\n    while evaluate_ast(while_loop.test, state, static_tools, custom_tools, authorized_imports):\n        for node in while_loop.body:\n            try:\n                evaluate_ast(node, state, static_tools, custom_tools, authorized_imports)\n            except BreakException:\n                return None\n            except ContinueException:\n                break\n        iterations += 1\n        if iterations > MAX_WHILE_ITERATIONS:\n            raise InterpreterError(f\"Maximum number of {MAX_WHILE_ITERATIONS} iterations in While loop exceeded\")\n    return None\n\n\ndef create_function(\n    func_def: ast.FunctionDef,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Callable:\n    source_code = ast.unparse(func_def)\n\n    def new_func(*args: Any, **kwargs: Any) -> Any:\n        func_state = state.copy()\n        arg_names = [arg.arg for arg in func_def.args.args]\n        default_values = [\n            evaluate_ast(d, state, static_tools, custom_tools, authorized_imports) for d in func_def.args.defaults\n        ]\n\n        # Apply default values\n        defaults = dict(zip(arg_names[-len(default_values) :], default_values))\n\n        # Set positional arguments\n        for name, value in zip(arg_names, args):\n            func_state[name] = value\n\n        # Set keyword arguments\n        for name, value in kwargs.items():\n            func_state[name] = value\n\n        # Handle variable arguments\n        if 
func_def.args.vararg:\n            vararg_name = func_def.args.vararg.arg\n            func_state[vararg_name] = args\n\n        if func_def.args.kwarg:\n            kwarg_name = func_def.args.kwarg.arg\n            func_state[kwarg_name] = kwargs\n\n        # Set default values for arguments that were not provided\n        for name, value in defaults.items():\n            if name not in func_state:\n                func_state[name] = value\n\n        # Update function state with self and __class__\n        if func_def.args.args and func_def.args.args[0].arg == \"self\":\n            if args:\n                func_state[\"self\"] = args[0]\n                func_state[\"__class__\"] = args[0].__class__\n\n        result = None\n        try:\n            for stmt in func_def.body:\n                result = evaluate_ast(stmt, func_state, static_tools, custom_tools, authorized_imports)\n        except ReturnException as e:\n            result = e.value\n\n        if func_def.name == \"__init__\":\n            return None\n\n        return result\n\n    # Store original AST, source code, and name\n    new_func.__ast__ = func_def\n    new_func.__source__ = source_code\n    new_func.__name__ = func_def.name\n\n    return new_func\n\n\ndef evaluate_function_def(\n    func_def: ast.FunctionDef,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Callable:\n    custom_tools[func_def.name] = create_function(func_def, state, static_tools, custom_tools, authorized_imports)\n    return custom_tools[func_def.name]\n\n\ndef evaluate_class_def(\n    class_def: ast.ClassDef,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> type:\n    class_name = class_def.name\n    bases = [evaluate_ast(base, state, static_tools, custom_tools, authorized_imports) for base in class_def.bases]\n\n    # 
Determine the metaclass to use\n    # If any base class has a custom metaclass, use it\n    metaclass = type\n    for base in bases:\n        base_metaclass = type(base)\n        if base_metaclass is not type:\n            metaclass = base_metaclass\n            break\n\n    # Use __prepare__ if the metaclass provides it (e.g., Enum uses _EnumDict)\n    if hasattr(metaclass, \"__prepare__\"):\n        class_dict = metaclass.__prepare__(class_name, bases)\n    else:\n        class_dict = {}\n\n    for stmt in class_def.body:\n        if isinstance(stmt, ast.FunctionDef):\n            class_dict[stmt.name] = evaluate_ast(stmt, state, static_tools, custom_tools, authorized_imports)\n        elif isinstance(stmt, ast.AnnAssign):\n            if stmt.value:\n                value = evaluate_ast(stmt.value, state, static_tools, custom_tools, authorized_imports)\n            target = stmt.target\n            # Handle target types for annotation\n            if isinstance(target, ast.Name):\n                # Simple variable annotation like \"x: int\"\n                annotation = evaluate_ast(stmt.annotation, state, static_tools, custom_tools, authorized_imports)\n                class_dict.setdefault(\"__annotations__\", {})[target.id] = annotation\n                # Assign value if provided\n                if stmt.value:\n                    class_dict[target.id] = value\n            elif isinstance(target, ast.Attribute):\n                # Attribute annotation like \"obj.attr: int\"\n                obj = evaluate_ast(target.value, class_dict, static_tools, custom_tools, authorized_imports)\n                # If there's a value assignment, set the attribute\n                if stmt.value:\n                    setattr(obj, target.attr, value)\n            elif isinstance(target, ast.Subscript):\n                # Subscript annotation like \"dict[key]: int\"\n                container = evaluate_ast(target.value, class_dict, static_tools, custom_tools, 
authorized_imports)\n                index = evaluate_ast(target.slice, state, static_tools, custom_tools, authorized_imports)\n                # If there's a value assignment, set the item\n                if stmt.value:\n                    container[index] = value\n            else:\n                raise InterpreterError(f\"Unsupported AnnAssign target in class body: {type(target).__name__}\")\n        elif isinstance(stmt, ast.Assign):\n            value = evaluate_ast(stmt.value, state, static_tools, custom_tools, authorized_imports)\n            for target in stmt.targets:\n                if isinstance(target, ast.Name):\n                    class_dict[target.id] = value\n                elif isinstance(target, ast.Attribute):\n                    obj = evaluate_ast(target.value, class_dict, static_tools, custom_tools, authorized_imports)\n                    setattr(obj, target.attr, value)\n        elif isinstance(stmt, ast.Pass):\n            pass\n        elif (\n            isinstance(stmt, ast.Expr)\n            and stmt == class_def.body[0]\n            and isinstance(stmt.value, ast.Constant)\n            and isinstance(stmt.value.value, str)\n        ):\n            # Check if it is a docstring: first statement in class body which is a string literal expression\n            class_dict[\"__doc__\"] = stmt.value.value\n        else:\n            raise InterpreterError(f\"Unsupported statement in class body: {stmt.__class__.__name__}\")\n\n    new_class = metaclass(class_name, tuple(bases), class_dict)\n    state[class_name] = new_class\n    return new_class\n\n\ndef evaluate_annassign(\n    annassign: ast.AnnAssign,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    # If there's a value to assign, evaluate it\n    if annassign.value:\n        value = evaluate_ast(annassign.value, state, static_tools, custom_tools, authorized_imports)\n        
# Set the value for the target\n        set_value(annassign.target, value, state, static_tools, custom_tools, authorized_imports)\n        return value\n    # For declarations without values (x: int), just return None\n    return None\n\n\ndef evaluate_augassign(\n    expression: ast.AugAssign,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    def get_current_value(target: ast.AST) -> Any:\n        if isinstance(target, ast.Name):\n            return state.get(target.id, 0)\n        elif isinstance(target, ast.Subscript):\n            obj = evaluate_ast(target.value, state, static_tools, custom_tools, authorized_imports)\n            key = evaluate_ast(target.slice, state, static_tools, custom_tools, authorized_imports)\n            return obj[key]\n        elif isinstance(target, ast.Attribute):\n            obj = evaluate_ast(target.value, state, static_tools, custom_tools, authorized_imports)\n            return getattr(obj, target.attr)\n        elif isinstance(target, ast.Tuple):\n            return tuple(get_current_value(elt) for elt in target.elts)\n        elif isinstance(target, ast.List):\n            return [get_current_value(elt) for elt in target.elts]\n        else:\n            raise InterpreterError(f\"AugAssign not supported for {type(target)} targets.\")\n\n    current_value = get_current_value(expression.target)\n    value_to_add = evaluate_ast(expression.value, state, static_tools, custom_tools, authorized_imports)\n\n    if isinstance(expression.op, ast.Add):\n        if isinstance(current_value, list):\n            if not isinstance(value_to_add, list):\n                raise InterpreterError(f\"Cannot add non-list value {value_to_add} to a list.\")\n            current_value += value_to_add\n        else:\n            current_value += value_to_add\n    elif isinstance(expression.op, ast.Sub):\n        current_value -= value_to_add\n   
 elif isinstance(expression.op, ast.Mult):\n        current_value *= value_to_add\n    elif isinstance(expression.op, ast.Div):\n        current_value /= value_to_add\n    elif isinstance(expression.op, ast.Mod):\n        current_value %= value_to_add\n    elif isinstance(expression.op, ast.Pow):\n        current_value **= value_to_add\n    elif isinstance(expression.op, ast.FloorDiv):\n        current_value //= value_to_add\n    elif isinstance(expression.op, ast.BitAnd):\n        current_value &= value_to_add\n    elif isinstance(expression.op, ast.BitOr):\n        current_value |= value_to_add\n    elif isinstance(expression.op, ast.BitXor):\n        current_value ^= value_to_add\n    elif isinstance(expression.op, ast.LShift):\n        current_value <<= value_to_add\n    elif isinstance(expression.op, ast.RShift):\n        current_value >>= value_to_add\n    else:\n        raise InterpreterError(f\"Operation {type(expression.op).__name__} is not supported.\")\n\n    # Update the state: current_value has been updated in-place\n    set_value(\n        expression.target,\n        current_value,\n        state,\n        static_tools,\n        custom_tools,\n        authorized_imports,\n    )\n\n    return current_value\n\n\ndef evaluate_boolop(\n    node: ast.BoolOp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    # Determine which value should trigger short-circuit based on operation type:\n    # - 'and' returns the first falsy value encountered (or the last value if all are truthy)\n    # - 'or' returns the first truthy value encountered (or the last value if all are falsy)\n    is_short_circuit_value = (lambda x: not x) if isinstance(node.op, ast.And) else (lambda x: bool(x))\n    for value in node.values:\n        result = evaluate_ast(value, state, static_tools, custom_tools, authorized_imports)\n        # Short-circuit: return immediately if the 
condition is met\n        if is_short_circuit_value(result):\n            return result\n    # If no short-circuit occurred, return the last evaluated value\n    return result\n\n\ndef evaluate_binop(\n    binop: ast.BinOp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    # Recursively evaluate the left and right operands\n    left_val = evaluate_ast(binop.left, state, static_tools, custom_tools, authorized_imports)\n    right_val = evaluate_ast(binop.right, state, static_tools, custom_tools, authorized_imports)\n\n    # Determine the operation based on the type of the operator in the BinOp\n    if isinstance(binop.op, ast.Add):\n        return left_val + right_val\n    elif isinstance(binop.op, ast.Sub):\n        return left_val - right_val\n    elif isinstance(binop.op, ast.Mult):\n        return left_val * right_val\n    elif isinstance(binop.op, ast.Div):\n        return left_val / right_val\n    elif isinstance(binop.op, ast.Mod):\n        return left_val % right_val\n    elif isinstance(binop.op, ast.Pow):\n        return left_val**right_val\n    elif isinstance(binop.op, ast.FloorDiv):\n        return left_val // right_val\n    elif isinstance(binop.op, ast.BitAnd):\n        return left_val & right_val\n    elif isinstance(binop.op, ast.BitOr):\n        return left_val | right_val\n    elif isinstance(binop.op, ast.BitXor):\n        return left_val ^ right_val\n    elif isinstance(binop.op, ast.LShift):\n        return left_val << right_val\n    elif isinstance(binop.op, ast.RShift):\n        return left_val >> right_val\n    else:\n        raise NotImplementedError(f\"Binary operation {type(binop.op).__name__} is not implemented.\")\n\n\ndef evaluate_assign(\n    assign: ast.Assign,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    result = 
evaluate_ast(assign.value, state, static_tools, custom_tools, authorized_imports)\n    if len(assign.targets) == 1:\n        target = assign.targets[0]\n        set_value(target, result, state, static_tools, custom_tools, authorized_imports)\n    else:\n        expanded_values = []\n        for tgt in assign.targets:\n            if isinstance(tgt, ast.Starred):\n                expanded_values.extend(result)\n            else:\n                expanded_values.append(result)\n\n        for tgt, val in zip(assign.targets, expanded_values):\n            set_value(tgt, val, state, static_tools, custom_tools, authorized_imports)\n    return result\n\n\ndef set_value(\n    target: ast.AST,\n    value: Any,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    if isinstance(target, ast.Name):\n        if target.id in static_tools:\n            raise InterpreterError(f\"Cannot assign to name '{target.id}': doing this would erase the existing tool!\")\n        state[target.id] = value\n    elif isinstance(target, ast.Tuple):\n        if not isinstance(value, tuple):\n            if hasattr(value, \"__iter__\") and not isinstance(value, (str, bytes)):\n                value = tuple(value)\n            else:\n                raise InterpreterError(\"Cannot unpack non-tuple value\")\n        if len(target.elts) != len(value):\n            raise InterpreterError(\"Cannot unpack tuple of wrong size\")\n        for i, elem in enumerate(target.elts):\n            set_value(elem, value[i], state, static_tools, custom_tools, authorized_imports)\n    elif isinstance(target, ast.Subscript):\n        obj = evaluate_ast(target.value, state, static_tools, custom_tools, authorized_imports)\n        key = evaluate_ast(target.slice, state, static_tools, custom_tools, authorized_imports)\n        obj[key] = value\n    elif isinstance(target, ast.Attribute):\n        obj = 
evaluate_ast(target.value, state, static_tools, custom_tools, authorized_imports)\n        setattr(obj, target.attr, value)\n\n\ndef evaluate_call(\n    call: ast.Call,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    if not isinstance(call.func, (ast.Call, ast.Lambda, ast.Attribute, ast.Name, ast.Subscript)):\n        raise InterpreterError(f\"This is not a correct function: {call.func}).\")\n\n    func, func_name = None, None\n\n    if isinstance(call.func, ast.Call):\n        func = evaluate_ast(call.func, state, static_tools, custom_tools, authorized_imports)\n    elif isinstance(call.func, ast.Lambda):\n        func = evaluate_ast(call.func, state, static_tools, custom_tools, authorized_imports)\n    elif isinstance(call.func, ast.Attribute):\n        obj = evaluate_ast(call.func.value, state, static_tools, custom_tools, authorized_imports)\n        func_name = call.func.attr\n        if not hasattr(obj, func_name):\n            raise InterpreterError(f\"Object {obj} has no attribute {func_name}\")\n        func = getattr(obj, func_name)\n    elif isinstance(call.func, ast.Name):\n        func_name = call.func.id\n        if func_name in state:\n            func = state[func_name]\n        elif func_name in static_tools:\n            func = static_tools[func_name]\n        elif func_name in custom_tools:\n            func = custom_tools[func_name]\n        elif func_name in ERRORS:\n            func = ERRORS[func_name]\n        else:\n            raise InterpreterError(\n                f\"Forbidden function evaluation: '{call.func.id}' is not among the explicitly allowed tools or defined/imported in the preceding code\"\n            )\n    elif isinstance(call.func, ast.Subscript):\n        func = evaluate_ast(call.func, state, static_tools, custom_tools, authorized_imports)\n        if not callable(func):\n            raise InterpreterError(f\"This 
is not a correct function: {call.func}).\")\n        func_name = None\n\n    args = []\n    for arg in call.args:\n        if isinstance(arg, ast.Starred):\n            args.extend(evaluate_ast(arg.value, state, static_tools, custom_tools, authorized_imports))\n        else:\n            args.append(evaluate_ast(arg, state, static_tools, custom_tools, authorized_imports))\n\n    kwargs = {}\n    for keyword in call.keywords:\n        if keyword.arg is None:\n            # **kwargs unpacking\n            starred_dict = evaluate_ast(keyword.value, state, static_tools, custom_tools, authorized_imports)\n            if not isinstance(starred_dict, dict):\n                raise InterpreterError(f\"Cannot unpack non-dict value in **kwargs: {type(starred_dict).__name__}\")\n            kwargs.update(starred_dict)\n        else:\n            # Normal keyword argument\n            kwargs[keyword.arg] = evaluate_ast(keyword.value, state, static_tools, custom_tools, authorized_imports)\n\n    if func_name == \"super\":\n        if not args:\n            if \"__class__\" in state and \"self\" in state:\n                return super(state[\"__class__\"], state[\"self\"])\n            else:\n                raise InterpreterError(\"super() needs at least one argument\")\n        cls = args[0]\n        if not isinstance(cls, type):\n            raise InterpreterError(\"super() argument 1 must be type\")\n        if len(args) == 1:\n            return super(cls)\n        elif len(args) == 2:\n            instance = args[1]\n            return super(cls, instance)\n        else:\n            raise InterpreterError(\"super() takes at most 2 arguments\")\n    elif func_name == \"print\":\n        state[\"_print_outputs\"] += \" \".join(map(str, args)) + \"\\n\"\n        return None\n    else:  # Assume it's a callable object\n        if (inspect.getmodule(func) == builtins) and inspect.isbuiltin(func) and (func not in static_tools.values()):\n            raise InterpreterError(\n     
           f\"Invoking a builtin function that has not been explicitly added as a tool is not allowed ({func_name}).\"\n            )\n        if (\n            hasattr(func, \"__name__\")\n            and func.__name__.startswith(\"__\")\n            and func.__name__.endswith(\"__\")\n            and (func.__name__ not in static_tools)\n            and (func.__name__ not in ALLOWED_DUNDER_METHODS)\n        ):\n            raise InterpreterError(f\"Forbidden call to dunder function: {func.__name__}\")\n        return func(*args, **kwargs)\n\n\ndef evaluate_subscript(\n    subscript: ast.Subscript,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    index = evaluate_ast(subscript.slice, state, static_tools, custom_tools, authorized_imports)\n    value = evaluate_ast(subscript.value, state, static_tools, custom_tools, authorized_imports)\n    try:\n        return value[index]\n    except (KeyError, IndexError, TypeError) as e:\n        error_message = f\"Could not index {value} with '{index}': {type(e).__name__}: {e}\"\n        if isinstance(index, str) and isinstance(value, Mapping):\n            close_matches = difflib.get_close_matches(index, list(value.keys()))\n            if len(close_matches) > 0:\n                error_message += f\". 
Maybe you meant one of these indexes instead: {str(close_matches)}\"\n        raise InterpreterError(error_message) from e\n\n\ndef evaluate_name(\n    name: ast.Name,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    if name.id in state:\n        return state[name.id]\n    elif name.id in static_tools:\n        return safer_func(static_tools[name.id], static_tools=static_tools, authorized_imports=authorized_imports)\n    elif name.id in custom_tools:\n        return custom_tools[name.id]\n    elif name.id in ERRORS:\n        return ERRORS[name.id]\n    close_matches = difflib.get_close_matches(name.id, list(state.keys()))\n    if len(close_matches) > 0:\n        return state[close_matches[0]]\n    raise InterpreterError(f\"The variable `{name.id}` is not defined.\")\n\n\ndef evaluate_condition(\n    condition: ast.Compare,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> bool | object:\n    result = True\n    left = evaluate_ast(condition.left, state, static_tools, custom_tools, authorized_imports)\n    for i, (op, comparator) in enumerate(zip(condition.ops, condition.comparators)):\n        op = type(op)\n        right = evaluate_ast(comparator, state, static_tools, custom_tools, authorized_imports)\n        if op == ast.Eq:\n            current_result = left == right\n        elif op == ast.NotEq:\n            current_result = left != right\n        elif op == ast.Lt:\n            current_result = left < right\n        elif op == ast.LtE:\n            current_result = left <= right\n        elif op == ast.Gt:\n            current_result = left > right\n        elif op == ast.GtE:\n            current_result = left >= right\n        elif op == ast.Is:\n            current_result = left is right\n        elif op == ast.IsNot:\n            current_result = 
left is not right\n        elif op == ast.In:\n            current_result = left in right\n        elif op == ast.NotIn:\n            current_result = left not in right\n        else:\n            raise InterpreterError(f\"Unsupported comparison operator: {op}\")\n\n        if current_result is False:\n            return False\n        result = current_result if i == 0 else (result and current_result)\n        left = right\n    return result\n\n\ndef evaluate_if(\n    if_statement: ast.If,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    result = None\n    test_result = evaluate_ast(if_statement.test, state, static_tools, custom_tools, authorized_imports)\n    if test_result:\n        for line in if_statement.body:\n            line_result = evaluate_ast(line, state, static_tools, custom_tools, authorized_imports)\n            if line_result is not None:\n                result = line_result\n    else:\n        for line in if_statement.orelse:\n            line_result = evaluate_ast(line, state, static_tools, custom_tools, authorized_imports)\n            if line_result is not None:\n                result = line_result\n    return result\n\n\ndef evaluate_for(\n    for_loop: ast.For,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Any:\n    result = None\n    iterator = evaluate_ast(for_loop.iter, state, static_tools, custom_tools, authorized_imports)\n    for counter in iterator:\n        set_value(\n            for_loop.target,\n            counter,\n            state,\n            static_tools,\n            custom_tools,\n            authorized_imports,\n        )\n        for node in for_loop.body:\n            try:\n                line_result = evaluate_ast(node, state, static_tools, custom_tools, authorized_imports)\n                if line_result 
is not None:\n                    result = line_result\n            except BreakException:\n                return result\n            except ContinueException:\n                break\n    return result\n\n\ndef _evaluate_comprehensions(\n    comprehensions: list[ast.comprehension],\n    evaluate_element: Callable[[dict[str, Any]], Any],\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Generator[Any, None, None]:\n    \"\"\"\n    Recursively evaluate nested comprehensions and yield elements.\n\n    Args:\n        comprehensions (`list[ast.comprehension]`): Comprehensions to evaluate.\n        evaluate_element (`Callable`): Function that evaluates the final element when comprehensions are exhausted.\n        state (`dict[str, Any]`): Current evaluation state.\n        static_tools (`dict[str, Callable]`): Static tools.\n        custom_tools (`dict[str, Callable]`): Custom tools.\n        authorized_imports (`list[str]`): Authorized imports.\n\n    Yields:\n        `Any`: Individual elements produced by the comprehension.\n    \"\"\"\n    # Base case: no more comprehensions\n    if not comprehensions:\n        yield evaluate_element(state)\n        return\n    # Evaluate first comprehension\n    comprehension = comprehensions[0]\n    iter_value = evaluate_ast(comprehension.iter, state, static_tools, custom_tools, authorized_imports)\n    for value in iter_value:\n        new_state = state.copy()\n        set_value(comprehension.target, value, new_state, static_tools, custom_tools, authorized_imports)\n        # Check all filter conditions\n        if all(\n            evaluate_ast(if_clause, new_state, static_tools, custom_tools, authorized_imports)\n            for if_clause in comprehension.ifs\n        ):\n            # Recurse with remaining comprehensions\n            yield from _evaluate_comprehensions(\n                comprehensions[1:], evaluate_element, 
new_state, static_tools, custom_tools, authorized_imports\n            )\n\n\ndef evaluate_listcomp(\n    listcomp: ast.ListComp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> list[Any]:\n    return list(\n        _evaluate_comprehensions(\n            listcomp.generators,\n            lambda comp_state: evaluate_ast(listcomp.elt, comp_state, static_tools, custom_tools, authorized_imports),\n            state,\n            static_tools,\n            custom_tools,\n            authorized_imports,\n        )\n    )\n\n\ndef evaluate_setcomp(\n    setcomp: ast.SetComp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> set[Any]:\n    return set(\n        _evaluate_comprehensions(\n            setcomp.generators,\n            lambda comp_state: evaluate_ast(setcomp.elt, comp_state, static_tools, custom_tools, authorized_imports),\n            state,\n            static_tools,\n            custom_tools,\n            authorized_imports,\n        )\n    )\n\n\ndef evaluate_dictcomp(\n    dictcomp: ast.DictComp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> dict[Any, Any]:\n    return dict(\n        _evaluate_comprehensions(\n            dictcomp.generators,\n            lambda comp_state: (\n                evaluate_ast(dictcomp.key, comp_state, static_tools, custom_tools, authorized_imports),\n                evaluate_ast(dictcomp.value, comp_state, static_tools, custom_tools, authorized_imports),\n            ),\n            state,\n            static_tools,\n            custom_tools,\n            authorized_imports,\n        )\n    )\n\n\ndef evaluate_try(\n    try_node: ast.Try,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: 
dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    try:\n        for stmt in try_node.body:\n            evaluate_ast(stmt, state, static_tools, custom_tools, authorized_imports)\n    except Exception as e:\n        matched = False\n        for handler in try_node.handlers:\n            if handler.type is None or isinstance(\n                e,\n                evaluate_ast(handler.type, state, static_tools, custom_tools, authorized_imports),\n            ):\n                matched = True\n                if handler.name:\n                    state[handler.name] = e\n                for stmt in handler.body:\n                    evaluate_ast(stmt, state, static_tools, custom_tools, authorized_imports)\n                break\n        if not matched:\n            raise e\n    else:\n        if try_node.orelse:\n            for stmt in try_node.orelse:\n                evaluate_ast(stmt, state, static_tools, custom_tools, authorized_imports)\n    finally:\n        if try_node.finalbody:\n            for stmt in try_node.finalbody:\n                evaluate_ast(stmt, state, static_tools, custom_tools, authorized_imports)\n\n\ndef evaluate_raise(\n    raise_node: ast.Raise,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    if raise_node.exc is not None:\n        exc = evaluate_ast(raise_node.exc, state, static_tools, custom_tools, authorized_imports)\n    else:\n        exc = None\n    if raise_node.cause is not None:\n        cause = evaluate_ast(raise_node.cause, state, static_tools, custom_tools, authorized_imports)\n    else:\n        cause = None\n    if exc is not None:\n        if cause is not None:\n            raise exc from cause\n        else:\n            raise exc\n    else:\n        raise InterpreterError(\"Re-raise is not supported without an active exception\")\n\n\ndef evaluate_assert(\n    assert_node: ast.Assert,\n    
state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    test_result = evaluate_ast(assert_node.test, state, static_tools, custom_tools, authorized_imports)\n    if not test_result:\n        if assert_node.msg:\n            msg = evaluate_ast(assert_node.msg, state, static_tools, custom_tools, authorized_imports)\n            raise AssertionError(msg)\n        else:\n            # Include the failing condition in the assertion message\n            test_code = ast.unparse(assert_node.test)\n            raise AssertionError(f\"Assertion failed: {test_code}\")\n\n\ndef evaluate_with(\n    with_node: ast.With,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    contexts = []\n    for item in with_node.items:\n        context_expr = evaluate_ast(item.context_expr, state, static_tools, custom_tools, authorized_imports)\n        enter_result = context_expr.__enter__()\n        contexts.append(context_expr)\n        if item.optional_vars:\n            state[item.optional_vars.id] = enter_result\n\n    try:\n        for stmt in with_node.body:\n            evaluate_ast(stmt, state, static_tools, custom_tools, authorized_imports)\n    except Exception as e:\n        # exc_info tracks the active exception as we unwind (from innermost context manager)\n        # Resetting it to (None, None, None) signals suppression to the remaining outer managers\n        exc_info = (type(e), e, e.__traceback__)\n        for context in reversed(contexts):\n            try:\n                if context.__exit__(*exc_info):\n                    exc_info = (None, None, None)  # suppressed; outer CMs see no exception\n            except Exception as exit_exc:\n                exc_info = (type(exit_exc), exit_exc, exit_exc.__traceback__)  # new exc replaces active\n        if exc_info[1] is not 
None:\n            raise exc_info[1].with_traceback(exc_info[2])\n    else:\n        for context in reversed(contexts):\n            context.__exit__(None, None, None)\n\n\ndef get_safe_module(raw_module, authorized_imports, visited=None):\n    \"\"\"Creates a safe copy of a module or returns the original if it's a function\"\"\"\n    # If it's a function or non-module object, return it directly\n    if not isinstance(raw_module, ModuleType):\n        return raw_module\n\n    # Handle circular references: Initialize visited set for the first call\n    if visited is None:\n        visited = set()\n\n    module_id = id(raw_module)\n    if module_id in visited:\n        return raw_module  # Return original for circular refs\n\n    visited.add(module_id)\n\n    # Create new module for actual modules\n    safe_module = ModuleType(raw_module.__name__)\n\n    # Copy all attributes by reference, recursively checking modules\n    for attr_name in dir(raw_module):\n        try:\n            attr_value = getattr(raw_module, attr_name)\n        except (ImportError, AttributeError) as e:\n            # lazy / dynamic loading module -> INFO log and skip\n            logger.info(\n                f\"Skipping import error while copying {raw_module.__name__}.{attr_name}: {type(e).__name__} - {e}\"\n            )\n            continue\n        # Recursively process nested modules, passing visited set\n        if isinstance(attr_value, ModuleType):\n            attr_value = get_safe_module(attr_value, authorized_imports, visited=visited)\n\n        setattr(safe_module, attr_name, attr_value)\n\n    return safe_module\n\n\ndef evaluate_import(expression, state, authorized_imports):\n    if isinstance(expression, ast.Import):\n        for alias in expression.names:\n            if check_import_authorized(alias.name, authorized_imports):\n                raw_module = import_module(alias.name)\n                state[alias.asname or alias.name] = get_safe_module(raw_module, 
authorized_imports)\n            else:\n                raise InterpreterError(\n                    f\"Import of {alias.name} is not allowed. Authorized imports are: {str(authorized_imports)}\"\n                )\n        return None\n    elif isinstance(expression, ast.ImportFrom):\n        if check_import_authorized(expression.module, authorized_imports):\n            raw_module = __import__(expression.module, fromlist=[alias.name for alias in expression.names])\n            module = get_safe_module(raw_module, authorized_imports)\n            if expression.names[0].name == \"*\":  # Handle \"from module import *\"\n                if hasattr(module, \"__all__\"):  # If module has __all__, import only those names\n                    for name in module.__all__:\n                        state[name] = getattr(module, name)\n                else:  # If no __all__, import all public names (those not starting with '_')\n                    for name in dir(module):\n                        if not name.startswith(\"_\"):\n                            state[name] = getattr(module, name)\n            else:  # regular from imports\n                for alias in expression.names:\n                    if hasattr(module, alias.name):\n                        state[alias.asname or alias.name] = getattr(module, alias.name)\n                    else:\n                        raise InterpreterError(f\"Module {expression.module} has no attribute {alias.name}\")\n        else:\n            raise InterpreterError(\n                f\"Import from {expression.module} is not allowed. 
Authorized imports are: {str(authorized_imports)}\"\n            )\n        return None\n\n\ndef evaluate_generatorexp(\n    genexp: ast.GeneratorExp,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> Generator[Any]:\n    def generator():\n        for gen in genexp.generators:\n            iter_value = evaluate_ast(gen.iter, state, static_tools, custom_tools, authorized_imports)\n            for value in iter_value:\n                new_state = state.copy()\n                set_value(\n                    gen.target,\n                    value,\n                    new_state,\n                    static_tools,\n                    custom_tools,\n                    authorized_imports,\n                )\n                if all(\n                    evaluate_ast(if_clause, new_state, static_tools, custom_tools, authorized_imports)\n                    for if_clause in gen.ifs\n                ):\n                    yield evaluate_ast(\n                        genexp.elt,\n                        new_state,\n                        static_tools,\n                        custom_tools,\n                        authorized_imports,\n                    )\n\n    return generator()\n\n\ndef evaluate_delete(\n    delete_node: ast.Delete,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str],\n) -> None:\n    \"\"\"\n    Evaluate a delete statement (del x, del x[y]).\n\n    Args:\n        delete_node: The AST Delete node to evaluate\n        state: The current state dictionary\n        static_tools: Dictionary of static tools\n        custom_tools: Dictionary of custom tools\n        authorized_imports: List of authorized imports\n    \"\"\"\n    for target in delete_node.targets:\n        if isinstance(target, ast.Name):\n            # Handle simple variable deletion (del x)\n            
if target.id in state:\n                del state[target.id]\n            else:\n                raise InterpreterError(f\"Cannot delete name '{target.id}': name is not defined\")\n        elif isinstance(target, ast.Subscript):\n            # Handle index/key deletion (del x[y])\n            obj = evaluate_ast(target.value, state, static_tools, custom_tools, authorized_imports)\n            index = evaluate_ast(target.slice, state, static_tools, custom_tools, authorized_imports)\n            try:\n                del obj[index]\n            except (TypeError, KeyError, IndexError) as e:\n                raise InterpreterError(f\"Cannot delete index/key: {str(e)}\")\n        else:\n            raise InterpreterError(f\"Deletion of {type(target).__name__} targets is not supported\")\n\n\n@safer_eval\ndef evaluate_ast(\n    expression: ast.AST,\n    state: dict[str, Any],\n    static_tools: dict[str, Callable],\n    custom_tools: dict[str, Callable],\n    authorized_imports: list[str] = BASE_BUILTIN_MODULES,\n):\n    \"\"\"\n    Evaluate an abstract syntax tree using the content of the variables stored in a state and only evaluating a given\n    set of functions.\n\n    This function will recurse through the nodes of the tree provided.\n\n    Args:\n        expression (`ast.AST`):\n            The code to evaluate, as an abstract syntax tree.\n        state (`Dict[str, Any]`):\n            A dictionary mapping variable names to values. The `state` is updated if need be when the evaluation\n            encounters assignments.\n        static_tools (`Dict[str, Callable]`):\n            Functions that may be called during the evaluation. Trying to change one of these static_tools will raise an error.\n        custom_tools (`Dict[str, Callable]`):\n            Functions that may be called during the evaluation. These custom_tools can be overwritten.\n        authorized_imports (`List[str]`):\n            The list of modules that can be imported by the code. 
By default, only a few safe modules are allowed.\n            If it contains \"*\", it will authorize any import. Use this at your own risk!\n    \"\"\"\n    if state.setdefault(\"_operations_count\", {\"counter\": 0})[\"counter\"] >= MAX_OPERATIONS:\n        raise InterpreterError(\n            f\"Reached the max number of operations of {MAX_OPERATIONS}. Maybe there is an infinite loop somewhere in the code, or you're just asking too many calculations.\"\n        )\n    state[\"_operations_count\"][\"counter\"] += 1\n    common_params = (state, static_tools, custom_tools, authorized_imports)\n    if isinstance(expression, ast.Assign):\n        # Assignment -> we evaluate the assignment which should update the state\n        # We return the variable assigned as it may be used to determine the final result.\n        return evaluate_assign(expression, *common_params)\n    elif isinstance(expression, ast.AnnAssign):\n        return evaluate_annassign(expression, *common_params)\n    elif isinstance(expression, ast.AugAssign):\n        return evaluate_augassign(expression, *common_params)\n    elif isinstance(expression, ast.Call):\n        # Function call -> we return the value of the function call\n        return evaluate_call(expression, *common_params)\n    elif isinstance(expression, ast.Constant):\n        # Constant -> just return the value\n        return expression.value\n    elif isinstance(expression, ast.Tuple):\n        return tuple((evaluate_ast(elt, *common_params) for elt in expression.elts))\n    elif isinstance(expression, ast.GeneratorExp):\n        return evaluate_generatorexp(expression, *common_params)\n    elif isinstance(expression, ast.ListComp):\n        return evaluate_listcomp(expression, *common_params)\n    elif isinstance(expression, ast.DictComp):\n        return evaluate_dictcomp(expression, *common_params)\n    elif isinstance(expression, ast.SetComp):\n        return evaluate_setcomp(expression, *common_params)\n    elif 
isinstance(expression, ast.UnaryOp):\n        return evaluate_unaryop(expression, *common_params)\n    elif isinstance(expression, ast.Starred):\n        return evaluate_ast(expression.value, *common_params)\n    elif isinstance(expression, ast.BoolOp):\n        # Boolean operation -> evaluate the operation\n        return evaluate_boolop(expression, *common_params)\n    elif isinstance(expression, ast.Break):\n        raise BreakException()\n    elif isinstance(expression, ast.Continue):\n        raise ContinueException()\n    elif isinstance(expression, ast.BinOp):\n        # Binary operation -> execute operation\n        return evaluate_binop(expression, *common_params)\n    elif isinstance(expression, ast.Compare):\n        # Comparison -> evaluate the comparison\n        return evaluate_condition(expression, *common_params)\n    elif isinstance(expression, ast.Lambda):\n        return evaluate_lambda(expression, *common_params)\n    elif isinstance(expression, ast.FunctionDef):\n        return evaluate_function_def(expression, *common_params)\n    elif isinstance(expression, ast.Dict):\n        # Dict -> evaluate all keys and values\n        keys = (evaluate_ast(k, *common_params) for k in expression.keys)\n        values = (evaluate_ast(v, *common_params) for v in expression.values)\n        return dict(zip(keys, values))\n    elif isinstance(expression, ast.Expr):\n        # Expression -> evaluate the content\n        return evaluate_ast(expression.value, *common_params)\n    elif isinstance(expression, ast.For):\n        # For loop -> execute the loop\n        return evaluate_for(expression, *common_params)\n    elif isinstance(expression, ast.FormattedValue):\n        # Formatted value (part of f-string) -> evaluate the content and format it\n        value = evaluate_ast(expression.value, *common_params)\n        # Early return if no format spec\n        if not expression.format_spec:\n            return value\n        # Apply format specification\n        
format_spec = evaluate_ast(expression.format_spec, *common_params)\n        return format(value, format_spec)\n    elif isinstance(expression, ast.If):\n        # If -> execute the right branch\n        return evaluate_if(expression, *common_params)\n    elif hasattr(ast, \"Index\") and isinstance(expression, ast.Index):\n        return evaluate_ast(expression.value, *common_params)\n    elif isinstance(expression, ast.JoinedStr):\n        return \"\".join([str(evaluate_ast(v, *common_params)) for v in expression.values])\n    elif isinstance(expression, ast.List):\n        # List -> evaluate all elements\n        return [evaluate_ast(elt, *common_params) for elt in expression.elts]\n    elif isinstance(expression, ast.Name):\n        # Name -> pick up the value in the state\n        return evaluate_name(expression, *common_params)\n    elif isinstance(expression, ast.Subscript):\n        # Subscript -> return the value of the indexing\n        return evaluate_subscript(expression, *common_params)\n    elif isinstance(expression, ast.IfExp):\n        test_val = evaluate_ast(expression.test, *common_params)\n        if test_val:\n            return evaluate_ast(expression.body, *common_params)\n        else:\n            return evaluate_ast(expression.orelse, *common_params)\n    elif isinstance(expression, ast.Attribute):\n        return evaluate_attribute(expression, *common_params)\n    elif isinstance(expression, ast.Slice):\n        return slice(\n            evaluate_ast(expression.lower, *common_params) if expression.lower is not None else None,\n            evaluate_ast(expression.upper, *common_params) if expression.upper is not None else None,\n            evaluate_ast(expression.step, *common_params) if expression.step is not None else None,\n        )\n    elif isinstance(expression, ast.While):\n        return evaluate_while(expression, *common_params)\n    elif isinstance(expression, (ast.Import, ast.ImportFrom)):\n        return 
evaluate_import(expression, state, authorized_imports)\n    elif isinstance(expression, ast.ClassDef):\n        return evaluate_class_def(expression, *common_params)\n    elif isinstance(expression, ast.Try):\n        return evaluate_try(expression, *common_params)\n    elif isinstance(expression, ast.Raise):\n        return evaluate_raise(expression, *common_params)\n    elif isinstance(expression, ast.Assert):\n        return evaluate_assert(expression, *common_params)\n    elif isinstance(expression, ast.With):\n        return evaluate_with(expression, *common_params)\n    elif isinstance(expression, ast.Set):\n        return set((evaluate_ast(elt, *common_params) for elt in expression.elts))\n    elif isinstance(expression, ast.Return):\n        raise ReturnException(evaluate_ast(expression.value, *common_params) if expression.value else None)\n    elif isinstance(expression, ast.Pass):\n        return None\n    elif isinstance(expression, ast.Delete):\n        return evaluate_delete(expression, *common_params)\n    else:\n        # For now we refuse anything else. 
Let's add things as we need them.\n        raise InterpreterError(f\"{expression.__class__.__name__} is not supported.\")\n\n\nclass FinalAnswerException(BaseException):\n    \"\"\"Exception raised when final_answer is called.\n\n    Inherits from BaseException instead of Exception to prevent being caught\n    by generic `except Exception` clauses in agent-generated code.\n    \"\"\"\n\n    def __init__(self, value):\n        self.value = value\n\n\ndef evaluate_python_code(\n    code: str,\n    static_tools: dict[str, Callable] | None = None,\n    custom_tools: dict[str, Callable] | None = None,\n    state: dict[str, Any] | None = None,\n    authorized_imports: list[str] = BASE_BUILTIN_MODULES,\n    max_print_outputs_length: int = DEFAULT_MAX_LEN_OUTPUT,\n    timeout_seconds: int | None = MAX_EXECUTION_TIME_SECONDS,\n):\n    \"\"\"\n    Evaluate a python expression using the content of the variables stored in a state and only evaluating a given set\n    of functions.\n\n    This function will recurse through the nodes of the tree provided.\n\n    Args:\n        code (`str`):\n            The code to evaluate.\n        static_tools (`Dict[str, Callable]`):\n            The functions that may be called during the evaluation. These can also be agents in a multiagent setting.\n            These tools cannot be overwritten in the code: any assignment to their name will raise an error.\n        custom_tools (`Dict[str, Callable]`):\n            The functions that may be called during the evaluation.\n            These tools can be overwritten in the code: any assignment to their name will overwrite them.\n        state (`Dict[str, Any]`):\n            A dictionary mapping variable names to values. 
The `state` should contain the initial inputs but will be\n            updated by this function to contain all variables as they are evaluated.\n            The print outputs will be stored in the state under the key \"_print_outputs\".\n        timeout_seconds (`int`, *optional*, defaults to `MAX_EXECUTION_TIME_SECONDS`):\n            Maximum time in seconds allowed for code execution. Set to `None` to disable timeout.\n    \"\"\"\n    try:\n        expression = ast.parse(code)\n    except SyntaxError as e:\n        raise InterpreterError(\n            f\"Code parsing failed on line {e.lineno} due to: {type(e).__name__}: {str(e)}\\n\"\n            f\"{e.text}\"\n            f\"{' ' * (e.offset or 0)}^\"\n        )\n\n    if state is None:\n        state = {}\n    static_tools = static_tools.copy() if static_tools is not None else {}\n    custom_tools = custom_tools if custom_tools is not None else {}\n    state[\"_print_outputs\"] = PrintContainer()\n    state[\"_operations_count\"] = {\"counter\": 0}\n\n    if \"final_answer\" in static_tools:\n        previous_final_answer = static_tools[\"final_answer\"]\n\n        def final_answer(*args, **kwargs):  # Allow arbitrary arguments to be passed\n            raise FinalAnswerException(previous_final_answer(*args, **kwargs))\n\n        static_tools[\"final_answer\"] = final_answer\n\n    # Define the actual execution logic\n    def _execute_code():\n        result = None\n        try:\n            for node in expression.body:\n                result = evaluate_ast(node, state, static_tools, custom_tools, authorized_imports)\n            state[\"_print_outputs\"].value = truncate_content(\n                str(state[\"_print_outputs\"]), max_length=max_print_outputs_length\n            )\n            is_final_answer = False\n            return result, is_final_answer\n        except FinalAnswerException as e:\n            state[\"_print_outputs\"].value = truncate_content(\n                
str(state[\"_print_outputs\"]), max_length=max_print_outputs_length\n            )\n            is_final_answer = True\n            return e.value, is_final_answer\n        except Exception as e:\n            state[\"_print_outputs\"].value = truncate_content(\n                str(state[\"_print_outputs\"]), max_length=max_print_outputs_length\n            )\n            raise InterpreterError(\n                f\"Code execution failed at line '{ast.get_source_segment(code, node)}' due to: {type(e).__name__}: {e}\"\n            )\n\n    # Apply timeout if specified\n    if timeout_seconds is not None:\n        _execute_code = timeout(timeout_seconds)(_execute_code)\n\n    return _execute_code()\n\n\n@dataclass\nclass CodeOutput:\n    output: Any\n    logs: str\n    is_final_answer: bool\n\n\nclass PythonExecutor(ABC):\n    @abstractmethod\n    def send_tools(self, tools: dict[str, Tool]) -> None: ...\n\n    @abstractmethod\n    def send_variables(self, variables: dict[str, Any]) -> None: ...\n\n    @abstractmethod\n    def __call__(self, code_action: str) -> CodeOutput: ...\n\n\nclass LocalPythonExecutor(PythonExecutor):\n    \"\"\"\n    Executor of Python code in a local environment.\n\n    This executor evaluates Python code with restricted access to imports and built-in functions,\n    making it suitable for running untrusted code. 
It maintains state between executions,\n    allows for custom tools and functions to be made available to the code, and captures\n    print outputs separately from return values.\n\n    Args:\n        additional_authorized_imports (`list[str]`):\n            Additional authorized imports for the executor.\n        max_print_outputs_length (`int`, defaults to `DEFAULT_MAX_LEN_OUTPUT=50_000`):\n            Maximum length of the print outputs.\n        additional_functions (`dict[str, Callable]`, *optional*):\n            Additional Python functions to be added to the executor.\n        timeout_seconds (`int`, *optional*, defaults to `MAX_EXECUTION_TIME_SECONDS`):\n            Maximum time in seconds allowed for code execution. Set to `None` to disable timeout.\n    \"\"\"\n\n    def __init__(\n        self,\n        additional_authorized_imports: list[str],\n        max_print_outputs_length: int | None = None,\n        additional_functions: dict[str, Callable] | None = None,\n        timeout_seconds: int | None = MAX_EXECUTION_TIME_SECONDS,\n    ):\n        self.custom_tools = {}\n        self.state = {\"__name__\": \"__main__\"}\n        self.max_print_outputs_length = max_print_outputs_length\n        if max_print_outputs_length is None:\n            self.max_print_outputs_length = DEFAULT_MAX_LEN_OUTPUT\n        self.additional_authorized_imports = additional_authorized_imports\n        self.authorized_imports = list(set(BASE_BUILTIN_MODULES) | set(self.additional_authorized_imports))\n        self._check_authorized_imports_are_installed()\n        self.static_tools = None\n        self.additional_functions = additional_functions or {}\n        self.timeout_seconds = timeout_seconds\n\n    def _check_authorized_imports_are_installed(self):\n        \"\"\"\n        Check that all authorized imports are installed on the system.\n\n        Handles wildcard imports (\"*\") and partial star-pattern imports (e.g., \"os.*\").\n\n        Raises:\n            
InterpreterError: If any of the authorized modules are not installed.\n        \"\"\"\n        missing_modules = [\n            base_module\n            for imp in self.authorized_imports\n            if imp != \"*\" and find_spec(base_module := imp.split(\".\")[0]) is None\n        ]\n        if missing_modules:\n            raise InterpreterError(\n                f\"Non-installed authorized modules: {', '.join(missing_modules)}. \"\n                f\"Please install these modules or remove them from the authorized imports list.\"\n            )\n\n    def __call__(self, code_action: str) -> CodeOutput:\n        output, is_final_answer = evaluate_python_code(\n            code_action,\n            static_tools=self.static_tools,\n            custom_tools=self.custom_tools,\n            state=self.state,\n            authorized_imports=self.authorized_imports,\n            max_print_outputs_length=self.max_print_outputs_length,\n            timeout_seconds=self.timeout_seconds,\n        )\n        logs = str(self.state[\"_print_outputs\"])\n        return CodeOutput(output=output, logs=logs, is_final_answer=is_final_answer)\n\n    def send_variables(self, variables: dict[str, Any]):\n        self.state.update(variables)\n\n    def send_tools(self, tools: dict[str, Tool]):\n        # Combine agent tools, base Python tools, and additional Python functions\n        self.static_tools = {**tools, **BASE_PYTHON_TOOLS.copy(), **self.additional_functions}\n\n\n__all__ = [\"evaluate_python_code\", \"LocalPythonExecutor\"]\n"
  },
  {
    "path": "src/smolagents/mcp_client.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2025 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\nfrom __future__ import annotations\n\nimport warnings\nfrom types import TracebackType\nfrom typing import TYPE_CHECKING, Any\n\nfrom smolagents.tools import Tool\n\n\n__all__ = [\"MCPClient\"]\n\nif TYPE_CHECKING:\n    from mcpadapt.core import StdioServerParameters\n\n\nclass MCPClient:\n    \"\"\"Manages the connection to an MCP server and makes its tools available to SmolAgents.\n\n    Note: tools can only be accessed after the connection has been started with the\n        `connect()` method, which is called during init. If you don't use the context manager,\n        we strongly encourage using \"try ... finally\" to ensure the connection is cleaned up.\n\n    Args:\n        server_parameters (StdioServerParameters | dict[str, Any] | list[StdioServerParameters | dict[str, Any]]):\n            Configuration parameters to connect to the MCP server. 
Can be a list if you want to connect multiple MCPs at once.\n\n            - An instance of `mcp.StdioServerParameters` for connecting a Stdio MCP server via standard input/output using a subprocess.\n\n            - A `dict` with at least:\n              - \"url\": URL of the server.\n              - \"transport\": Transport protocol to use, one of:\n                - \"streamable-http\": Streamable HTTP transport (default).\n                - \"sse\": Legacy HTTP+SSE transport (deprecated).\n        adapter_kwargs (dict[str, Any], optional):\n            Additional keyword arguments to be passed directly to `MCPAdapt`.\n        structured_output (bool, optional, defaults to False):\n            Whether to enable structured output features for MCP tools. If True, enables:\n            - Support for outputSchema in MCP tools\n            - Structured content handling (structuredContent from MCP responses)\n            - JSON parsing fallback for structured data\n            If False, uses the original simple text-only behavior for backwards compatibility.\n\n    Example:\n        ```python\n        # fully managed context manager + stdio\n        with MCPClient(...) 
as tools:\n            # tools are now available\n\n        # context manager + Streamable HTTP transport:\n        with MCPClient({\"url\": \"http://localhost:8000/mcp\", \"transport\": \"streamable-http\"}) as tools:\n            # tools are now available\n\n        # Enable structured output for advanced MCP tools:\n        with MCPClient(server_parameters, structured_output=True) as tools:\n            # tools with structured output support are now available\n\n        # manually manage the connection via the mcp_client object:\n        try:\n            mcp_client = MCPClient(...)\n            tools = mcp_client.get_tools()\n\n            # use your tools here.\n        finally:\n            mcp_client.disconnect()\n        ```\n    \"\"\"\n\n    def __init__(\n        self,\n        server_parameters: \"StdioServerParameters\" | dict[str, Any] | list[\"StdioServerParameters\" | dict[str, Any]],\n        adapter_kwargs: dict[str, Any] | None = None,\n        structured_output: bool | None = None,\n    ):\n        # Handle future warning for structured_output default value change\n        if structured_output is None:\n            warnings.warn(\n                \"Parameter 'structured_output' was not specified. \"\n                \"Currently it defaults to False, but in version 1.25, the default will change to True. \"\n                \"To suppress this warning, explicitly set structured_output=True (new behavior) or structured_output=False (legacy behavior). 
\"\n                \"See documentation at https://huggingface.co/docs/smolagents/tutorials/tools#structured-output-and-output-schema-support for more details.\",\n                FutureWarning,\n                stacklevel=2,\n            )\n            structured_output = False\n\n        try:\n            from mcpadapt.core import MCPAdapt\n            from mcpadapt.smolagents_adapter import SmolAgentsAdapter\n        except ModuleNotFoundError:\n            raise ModuleNotFoundError(\"Please install 'mcp' extra to use MCPClient: `pip install 'smolagents[mcp]'`\")\n        if isinstance(server_parameters, dict):\n            transport = server_parameters.get(\"transport\")\n            if transport is None:\n                transport = \"streamable-http\"\n                server_parameters[\"transport\"] = transport\n            if transport not in {\"sse\", \"streamable-http\"}:\n                raise ValueError(\n                    f\"Unsupported transport: {transport}. Supported transports are 'streamable-http' and 'sse'.\"\n                )\n        adapter_kwargs = adapter_kwargs or {}\n        self._adapter = MCPAdapt(\n            server_parameters, SmolAgentsAdapter(structured_output=structured_output), **adapter_kwargs\n        )\n        self._tools: list[Tool] | None = None\n        self.connect()\n\n    def connect(self):\n        \"\"\"Connect to the MCP server and initialize the tools.\"\"\"\n        self._tools: list[Tool] = self._adapter.__enter__()\n\n    def disconnect(\n        self,\n        exc_type: type[BaseException] | None = None,\n        exc_value: BaseException | None = None,\n        exc_traceback: TracebackType | None = None,\n    ):\n        \"\"\"Disconnect from the MCP server\"\"\"\n        self._adapter.__exit__(exc_type, exc_value, exc_traceback)\n\n    def get_tools(self) -> list[Tool]:\n        \"\"\"The SmolAgents tools available from the MCP server.\n\n        Note: for now, this always returns the tools available at the 
creation of the session,\n        but a future release will also return any new tools available from the MCP server\n        at call time.\n\n        Raises:\n            ValueError: If the MCP server tools are None (usually because the connection has not been started).\n\n        Returns:\n            list[Tool]: The SmolAgents tools available from the MCP server.\n        \"\"\"\n        if self._tools is None:\n            raise ValueError(\n                \"Couldn't retrieve tools from MCP server, run `mcp_client.connect()` first before accessing `tools`\"\n            )\n        return self._tools\n\n    def __enter__(self) -> list[Tool]:\n        \"\"\"Connect to the MCP server and return the tools directly.\n\n        Note that because of the `.connect` in the init, the mcp_client\n        is already connected at this point.\n        \"\"\"\n        return self._tools\n\n    def __exit__(\n        self,\n        exc_type: type[BaseException] | None,\n        exc_value: BaseException | None,\n        exc_traceback: TracebackType | None,\n    ):\n        \"\"\"Disconnect from the MCP server.\"\"\"\n        self.disconnect(exc_type, exc_value, exc_traceback)\n"
  },
  {
    "path": "src/smolagents/memory.py",
    "content": "import inspect\nfrom dataclasses import asdict, dataclass\nfrom logging import getLogger\nfrom typing import TYPE_CHECKING, Any, Callable, Type\n\nfrom smolagents.models import ChatMessage, MessageRole, get_dict_from_nested_dataclasses\nfrom smolagents.monitoring import AgentLogger, LogLevel, Timing, TokenUsage\nfrom smolagents.utils import AgentError, make_json_serializable\n\n\nif TYPE_CHECKING:\n    import PIL.Image\n\n    from smolagents.models import ChatMessage\n    from smolagents.monitoring import AgentLogger\n\n\n__all__ = [\"AgentMemory\"]\n\n\nlogger = getLogger(__name__)\n\n\n@dataclass\nclass ToolCall:\n    name: str\n    arguments: Any\n    id: str\n\n    def dict(self):\n        return {\n            \"id\": self.id,\n            \"type\": \"function\",\n            \"function\": {\n                \"name\": self.name,\n                \"arguments\": make_json_serializable(self.arguments),\n            },\n        }\n\n\n@dataclass\nclass MemoryStep:\n    def dict(self):\n        return asdict(self)\n\n    def to_messages(self, summary_mode: bool = False) -> list[ChatMessage]:\n        raise NotImplementedError\n\n\n@dataclass\nclass ActionStep(MemoryStep):\n    step_number: int\n    timing: Timing\n    model_input_messages: list[ChatMessage] | None = None\n    tool_calls: list[ToolCall] | None = None\n    error: AgentError | None = None\n    model_output_message: ChatMessage | None = None\n    model_output: str | list[dict[str, Any]] | None = None\n    code_action: str | None = None\n    observations: str | None = None\n    observations_images: list[\"PIL.Image.Image\"] | None = None\n    action_output: Any = None\n    token_usage: TokenUsage | None = None\n    is_final_answer: bool = False\n\n    def dict(self):\n        # We overwrite the method to parse the tool_calls and action_output manually\n        return {\n            \"step_number\": self.step_number,\n            \"timing\": self.timing.dict(),\n            
\"model_input_messages\": [\n                make_json_serializable(get_dict_from_nested_dataclasses(msg)) for msg in self.model_input_messages\n            ]\n            if self.model_input_messages\n            else None,\n            \"tool_calls\": [tc.dict() for tc in self.tool_calls] if self.tool_calls else [],\n            \"error\": self.error.dict() if self.error else None,\n            \"model_output_message\": make_json_serializable(get_dict_from_nested_dataclasses(self.model_output_message))\n            if self.model_output_message\n            else None,\n            \"model_output\": self.model_output,\n            \"code_action\": self.code_action,\n            \"observations\": self.observations,\n            \"observations_images\": [image.tobytes() for image in self.observations_images]\n            if self.observations_images\n            else None,\n            \"action_output\": make_json_serializable(self.action_output),\n            \"token_usage\": asdict(self.token_usage) if self.token_usage else None,\n            \"is_final_answer\": self.is_final_answer,\n        }\n\n    def to_messages(self, summary_mode: bool = False) -> list[ChatMessage]:\n        messages = []\n        if self.model_output is not None and not summary_mode:\n            messages.append(\n                ChatMessage(role=MessageRole.ASSISTANT, content=[{\"type\": \"text\", \"text\": self.model_output.strip()}])\n            )\n\n        if self.tool_calls is not None:\n            messages.append(\n                ChatMessage(\n                    role=MessageRole.TOOL_CALL,\n                    content=[\n                        {\n                            \"type\": \"text\",\n                            \"text\": \"Calling tools:\\n\" + str([tc.dict() for tc in self.tool_calls]),\n                        }\n                    ],\n                )\n            )\n\n        if self.observations_images:\n            messages.append(\n                
ChatMessage(\n                    role=MessageRole.USER,\n                    content=[\n                        {\n                            \"type\": \"image\",\n                            \"image\": image,\n                        }\n                        for image in self.observations_images\n                    ],\n                )\n            )\n\n        if self.observations is not None:\n            messages.append(\n                ChatMessage(\n                    role=MessageRole.TOOL_RESPONSE,\n                    content=[\n                        {\n                            \"type\": \"text\",\n                            \"text\": f\"Observation:\\n{self.observations}\",\n                        }\n                    ],\n                )\n            )\n        if self.error is not None:\n            error_message = (\n                \"Error:\\n\"\n                + str(self.error)\n                + \"\\nNow let's retry: take care not to repeat previous errors! 
If you have retried several times, try a completely different approach.\\n\"\n            )\n            message_content = f\"Call id: {self.tool_calls[0].id}\\n\" if self.tool_calls else \"\"\n            message_content += error_message\n            messages.append(\n                ChatMessage(role=MessageRole.TOOL_RESPONSE, content=[{\"type\": \"text\", \"text\": message_content}])\n            )\n\n        return messages\n\n\n@dataclass\nclass PlanningStep(MemoryStep):\n    model_input_messages: list[ChatMessage]\n    model_output_message: ChatMessage\n    plan: str\n    timing: Timing\n    token_usage: TokenUsage | None = None\n\n    def dict(self):\n        return {\n            \"model_input_messages\": [\n                make_json_serializable(get_dict_from_nested_dataclasses(msg)) for msg in self.model_input_messages\n            ],\n            \"model_output_message\": make_json_serializable(\n                get_dict_from_nested_dataclasses(self.model_output_message)\n            ),\n            \"plan\": self.plan,\n            \"timing\": self.timing.dict(),\n            \"token_usage\": asdict(self.token_usage) if self.token_usage else None,\n        }\n\n    def to_messages(self, summary_mode: bool = False) -> list[ChatMessage]:\n        if summary_mode:\n            return []\n        return [\n            ChatMessage(role=MessageRole.ASSISTANT, content=[{\"type\": \"text\", \"text\": self.plan.strip()}]),\n            ChatMessage(\n                role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Now proceed and carry out this plan.\"}]\n            ),\n            # This second message creates a role change to prevent models from simply continuing the plan message\n        ]\n\n\n@dataclass\nclass TaskStep(MemoryStep):\n    task: str\n    task_images: list[\"PIL.Image.Image\"] | None = None\n\n    def to_messages(self, summary_mode: bool = False) -> list[ChatMessage]:\n        content = [{\"type\": \"text\", \"text\": 
f\"New task:\\n{self.task}\"}]\n        if self.task_images:\n            content.extend([{\"type\": \"image\", \"image\": image} for image in self.task_images])\n\n        return [ChatMessage(role=MessageRole.USER, content=content)]\n\n\n@dataclass\nclass SystemPromptStep(MemoryStep):\n    system_prompt: str\n\n    def to_messages(self, summary_mode: bool = False) -> list[ChatMessage]:\n        if summary_mode:\n            return []\n        return [ChatMessage(role=MessageRole.SYSTEM, content=[{\"type\": \"text\", \"text\": self.system_prompt}])]\n\n\n@dataclass\nclass FinalAnswerStep(MemoryStep):\n    output: Any\n\n\nclass AgentMemory:\n    \"\"\"Memory for the agent, containing the system prompt and all steps taken by the agent.\n\n    This class is used to store the agent's steps, including tasks, actions, and planning steps.\n    It allows for resetting the memory, retrieving succinct or full step information, and replaying the agent's steps.\n\n    Args:\n        system_prompt (`str`): System prompt for the agent, which sets the context and instructions for the agent's behavior.\n\n    **Attributes**:\n        - **system_prompt** (`SystemPromptStep`) -- System prompt step for the agent.\n        - **steps** (`list[TaskStep | ActionStep | PlanningStep]`) -- List of steps taken by the agent, which can include tasks, actions, and planning steps.\n    \"\"\"\n\n    def __init__(self, system_prompt: str):\n        self.system_prompt: SystemPromptStep = SystemPromptStep(system_prompt=system_prompt)\n        self.steps: list[TaskStep | ActionStep | PlanningStep] = []\n\n    def reset(self):\n        \"\"\"Reset the agent's memory, clearing all steps and keeping the system prompt.\"\"\"\n        self.steps = []\n\n    def get_succinct_steps(self) -> list[dict]:\n        \"\"\"Return a succinct representation of the agent's steps, excluding model input messages.\"\"\"\n        return [\n            {key: value for key, value in step.dict().items() if key != 
\"model_input_messages\"} for step in self.steps\n        ]\n\n    def get_full_steps(self) -> list[dict]:\n        \"\"\"Return a full representation of the agent's steps, including model input messages.\"\"\"\n        if len(self.steps) == 0:\n            return []\n        return [step.dict() for step in self.steps]\n\n    def replay(self, logger: AgentLogger, detailed: bool = False):\n        \"\"\"Prints a pretty replay of the agent's steps.\n\n        Args:\n            logger (`AgentLogger`): The logger to print replay logs to.\n            detailed (`bool`, default `False`): If True, also displays the memory at each step. Defaults to False.\n                Careful: will increase log length exponentially. Use only for debugging.\n        \"\"\"\n        logger.console.log(\"Replaying the agent's steps:\")\n        logger.log_markdown(title=\"System prompt\", content=self.system_prompt.system_prompt, level=LogLevel.ERROR)\n        for step in self.steps:\n            if isinstance(step, TaskStep):\n                logger.log_task(step.task, \"\", level=LogLevel.ERROR)\n            elif isinstance(step, ActionStep):\n                logger.log_rule(f\"Step {step.step_number}\", level=LogLevel.ERROR)\n                if detailed and step.model_input_messages is not None:\n                    logger.log_messages(step.model_input_messages, level=LogLevel.ERROR)\n                if step.model_output is not None:\n                    logger.log_markdown(title=\"Agent output:\", content=step.model_output, level=LogLevel.ERROR)\n            elif isinstance(step, PlanningStep):\n                logger.log_rule(\"Planning step\", level=LogLevel.ERROR)\n                if detailed and step.model_input_messages is not None:\n                    logger.log_messages(step.model_input_messages, level=LogLevel.ERROR)\n                logger.log_markdown(title=\"Agent output:\", content=step.plan, level=LogLevel.ERROR)\n\n    def return_full_code(self) -> str:\n        
\"\"\"Returns all code actions from the agent's steps, concatenated as a single script.\"\"\"\n        return \"\\n\\n\".join(\n            [step.code_action for step in self.steps if isinstance(step, ActionStep) and step.code_action is not None]\n        )\n\n\nclass CallbackRegistry:\n    \"\"\"Registry for callbacks that are called at each step of the agent's execution.\n\n    Callbacks are registered by passing a step class and a callback function.\n    \"\"\"\n\n    def __init__(self):\n        self._callbacks: dict[Type[MemoryStep], list[Callable]] = {}\n\n    def register(self, step_cls: Type[MemoryStep], callback: Callable):\n        \"\"\"Register a callback for a step class.\n\n        Args:\n            step_cls (Type[MemoryStep]): Step class to register the callback for.\n            callback (Callable): Callback function to register.\n        \"\"\"\n        if step_cls not in self._callbacks:\n            self._callbacks[step_cls] = []\n        self._callbacks[step_cls].append(callback)\n\n    def callback(self, memory_step, **kwargs):\n        \"\"\"Call callbacks registered for a step type.\n\n        Args:\n            memory_step (MemoryStep): Step to call the callbacks for.\n            **kwargs: Additional arguments to pass to callbacks that accept them.\n                Typically, includes the agent instance.\n\n        Notes:\n            For backwards compatibility, callbacks with a single parameter signature\n            receive only the memory_step, while callbacks with multiple parameters\n            receive both the memory_step and any additional kwargs.\n        \"\"\"\n        # For compatibility with old callbacks that only take the step as an argument\n        for cls in memory_step.__class__.__mro__:\n            for cb in self._callbacks.get(cls, []):\n                cb(memory_step) if len(inspect.signature(cb).parameters) == 1 else cb(memory_step, **kwargs)\n"
  },
  {
    "path": "src/smolagents/models.py",
    "content": "# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport json\nimport logging\nimport os\nimport re\nimport uuid\nimport warnings\nfrom collections.abc import Generator\nfrom copy import deepcopy\nfrom dataclasses import asdict, dataclass\nfrom enum import Enum\nfrom threading import Thread\nfrom typing import TYPE_CHECKING, Any\n\nfrom .monitoring import TokenUsage\nfrom .tools import Tool\nfrom .utils import RateLimiter, Retrying, _is_package_available, encode_image_base64, make_image_url, parse_json_blob\n\n\nif TYPE_CHECKING:\n    from transformers import StoppingCriteriaList\n\n\nlogger = logging.getLogger(__name__)\n\nRETRY_WAIT = 60\nRETRY_MAX_ATTEMPTS = 3\nRETRY_EXPONENTIAL_BASE = 2\nRETRY_JITTER = True\nSTRUCTURED_GENERATION_PROVIDERS = [\"cerebras\", \"fireworks-ai\"]\nCODEAGENT_RESPONSE_FORMAT = {\n    \"type\": \"json_schema\",\n    \"json_schema\": {\n        \"schema\": {\n            \"additionalProperties\": False,\n            \"properties\": {\n                \"thought\": {\n                    \"description\": \"A free form text description of the thought process.\",\n                    \"title\": \"Thought\",\n                    \"type\": \"string\",\n                },\n                \"code\": {\n                    \"description\": \"Valid Python code snippet implementing the thought.\",\n                    \"title\": \"Code\",\n                    
\"type\": \"string\",\n                },\n            },\n            \"required\": [\"thought\", \"code\"],\n            \"title\": \"ThoughtAndCodeAnswer\",\n            \"type\": \"object\",\n        },\n        \"name\": \"ThoughtAndCodeAnswer\",\n        \"strict\": True,\n    },\n}\n\n\ndef get_dict_from_nested_dataclasses(obj, ignore_key=None):\n    def convert(obj):\n        if hasattr(obj, \"__dataclass_fields__\"):\n            return {k: convert(v) for k, v in asdict(obj).items() if k != ignore_key}\n        return obj\n\n    return convert(obj)\n\n\ndef remove_content_after_stop_sequences(content: str | None, stop_sequences: list[str] | None) -> str | None:\n    \"\"\"Remove content after any stop sequence is encountered.\n\n    Some providers may return ``None`` content (for example when responding purely with tool calls),\n    so we skip processing in that case.\n    \"\"\"\n    if content is None or not stop_sequences:\n        return content\n\n    for stop_seq in stop_sequences:\n        split = content.split(stop_seq)\n        content = split[0]\n    return content\n\n\n@dataclass\nclass ChatMessageToolCallFunction:\n    arguments: Any\n    name: str\n    description: str | None = None\n\n\n@dataclass\nclass ChatMessageToolCall:\n    function: ChatMessageToolCallFunction\n    id: str\n    type: str\n\n    def __str__(self) -> str:\n        return f\"Call: {self.id}: Calling {str(self.function.name)} with arguments: {str(self.function.arguments)}\"\n\n\nclass MessageRole(str, Enum):\n    USER = \"user\"\n    ASSISTANT = \"assistant\"\n    SYSTEM = \"system\"\n    TOOL_CALL = \"tool-call\"\n    TOOL_RESPONSE = \"tool-response\"\n\n    @classmethod\n    def roles(cls):\n        return [r.value for r in cls]\n\n\n@dataclass\nclass ChatMessage:\n    role: MessageRole\n    content: str | list[dict[str, Any]] | None = None\n    tool_calls: list[ChatMessageToolCall] | None = None\n    raw: Any | None = None  # Stores the raw output from the API\n    
token_usage: TokenUsage | None = None\n\n    def __post_init__(self) -> None:\n        if self.tool_calls is None:\n            return\n        self.tool_calls = [_coerce_tool_call(tool_call) for tool_call in self.tool_calls]\n\n    def model_dump_json(self):\n        return json.dumps(get_dict_from_nested_dataclasses(self, ignore_key=\"raw\"))\n\n    @classmethod\n    def from_dict(cls, data: dict, raw: Any | None = None, token_usage: TokenUsage | None = None) -> \"ChatMessage\":\n        if data.get(\"tool_calls\"):\n            tool_calls = [\n                ChatMessageToolCall(\n                    function=ChatMessageToolCallFunction(**tc[\"function\"]), id=tc[\"id\"], type=tc[\"type\"]\n                )\n                for tc in data[\"tool_calls\"]\n            ]\n            data[\"tool_calls\"] = tool_calls\n        return cls(\n            role=MessageRole(data[\"role\"]),\n            content=data.get(\"content\"),\n            tool_calls=data.get(\"tool_calls\"),\n            raw=raw,\n            token_usage=token_usage,\n        )\n\n    def dict(self):\n        return get_dict_from_nested_dataclasses(self)\n\n    def render_as_markdown(self) -> str:\n        rendered = str(self.content) or \"\"\n        if self.tool_calls:\n            rendered += \"\\n\".join(\n                [\n                    json.dumps({\"tool\": tool.function.name, \"arguments\": tool.function.arguments})\n                    for tool in self.tool_calls\n                ]\n            )\n        return rendered\n\n\ndef _coerce_tool_call(tool_call: Any) -> ChatMessageToolCall:\n    if isinstance(tool_call, ChatMessageToolCall):\n        return tool_call\n\n    if isinstance(tool_call, dict):\n        tool_call_dict = tool_call\n    elif hasattr(tool_call, \"model_dump\"):\n        tool_call_dict = tool_call.model_dump()\n    elif hasattr(tool_call, \"dict\") and callable(tool_call.dict):\n        tool_call_dict = tool_call.dict()\n\n    return ChatMessageToolCall(\n      
  function=ChatMessageToolCallFunction(\n            arguments=tool_call_dict[\"function\"][\"arguments\"],\n            name=tool_call_dict[\"function\"][\"name\"],\n        ),\n        id=tool_call_dict[\"id\"],\n        type=tool_call_dict[\"type\"],\n    )\n\n\ndef parse_json_if_needed(arguments: str | dict) -> str | dict:\n    if isinstance(arguments, dict):\n        return arguments\n    else:\n        try:\n            return json.loads(arguments)\n        except Exception:\n            return arguments\n\n\n@dataclass\nclass ChatMessageToolCallStreamDelta:\n    \"\"\"Represents a streaming delta for tool calls during generation.\"\"\"\n\n    index: int | None = None\n    id: str | None = None\n    type: str | None = None\n    function: ChatMessageToolCallFunction | None = None\n\n\n@dataclass\nclass ChatMessageStreamDelta:\n    content: str | None = None\n    tool_calls: list[ChatMessageToolCallStreamDelta] | None = None\n    token_usage: TokenUsage | None = None\n\n\ndef agglomerate_stream_deltas(\n    stream_deltas: list[ChatMessageStreamDelta], role: MessageRole = MessageRole.ASSISTANT\n) -> ChatMessage:\n    \"\"\"\n    Agglomerate a list of stream deltas into a single chat message.\n    \"\"\"\n    accumulated_tool_calls: dict[int, ChatMessageToolCallStreamDelta] = {}\n    accumulated_content = \"\"\n    total_input_tokens = 0\n    total_output_tokens = 0\n    for stream_delta in stream_deltas:\n        if stream_delta.token_usage:\n            total_input_tokens += stream_delta.token_usage.input_tokens\n            total_output_tokens += stream_delta.token_usage.output_tokens\n        if stream_delta.content:\n            accumulated_content += stream_delta.content\n        if stream_delta.tool_calls:\n            for tool_call_delta in stream_delta.tool_calls:  # Normally there should be only one call at a time\n                # Add an entry to accumulated_tool_calls for the new tool call index if needed\n                if tool_call_delta.index is 
not None:\n                    if tool_call_delta.index not in accumulated_tool_calls:\n                        accumulated_tool_calls[tool_call_delta.index] = ChatMessageToolCallStreamDelta(\n                            id=tool_call_delta.id,\n                            type=tool_call_delta.type,\n                            function=ChatMessageToolCallFunction(name=\"\", arguments=\"\"),\n                        )\n                    # Update the tool call at the specific index\n                    tool_call = accumulated_tool_calls[tool_call_delta.index]\n                    if tool_call_delta.id:\n                        tool_call.id = tool_call_delta.id\n                    if tool_call_delta.type:\n                        tool_call.type = tool_call_delta.type\n                    if tool_call_delta.function:\n                        if tool_call_delta.function.name and len(tool_call_delta.function.name) > 0:\n                            tool_call.function.name = tool_call_delta.function.name\n                        if tool_call_delta.function.arguments:\n                            tool_call.function.arguments += tool_call_delta.function.arguments\n                else:\n                    raise ValueError(f\"Tool call index is not provided in tool delta: {tool_call_delta}\")\n\n    return ChatMessage(\n        role=role,\n        content=accumulated_content,\n        tool_calls=[\n            ChatMessageToolCall(\n                function=ChatMessageToolCallFunction(\n                    name=tool_call_stream_delta.function.name,\n                    arguments=tool_call_stream_delta.function.arguments,\n                ),\n                id=tool_call_stream_delta.id or \"\",\n                type=\"function\",\n            )\n            for tool_call_stream_delta in accumulated_tool_calls.values()\n            if tool_call_stream_delta.function\n        ],\n        token_usage=TokenUsage(\n            input_tokens=total_input_tokens,\n            
output_tokens=total_output_tokens,\n        ),\n    )\n\n\ntool_role_conversions = {\n    MessageRole.TOOL_CALL: MessageRole.ASSISTANT,\n    MessageRole.TOOL_RESPONSE: MessageRole.USER,\n}\n\n\ndef get_tool_json_schema(tool: Tool) -> dict:\n    properties = deepcopy(tool.inputs)\n    required = []\n    for key, value in properties.items():\n        if value[\"type\"] == \"any\":\n            value[\"type\"] = \"string\"\n        if not (\"nullable\" in value and value[\"nullable\"]):\n            required.append(key)\n\n        # parse anyOf\n        if \"anyOf\" in value:\n            types = []\n            enum = None\n            for t in value[\"anyOf\"]:\n                if t[\"type\"] == \"null\":\n                    value[\"nullable\"] = True\n                    continue\n                if t[\"type\"] == \"any\":\n                    types.append(\"string\")\n                else:\n                    types.append(t[\"type\"])\n                if \"enum\" in t:  # assuming there is only one enum in anyOf\n                    enum = t[\"enum\"]\n\n            value[\"type\"] = types if len(types) > 1 else types[0]\n            if enum is not None:\n                value[\"enum\"] = enum\n\n            value.pop(\"anyOf\")\n\n    return {\n        \"type\": \"function\",\n        \"function\": {\n            \"name\": tool.name,\n            \"description\": tool.description,\n            \"parameters\": {\n                \"type\": \"object\",\n                \"properties\": properties,\n                \"required\": required,\n            },\n        },\n    }\n\n\ndef get_clean_message_list(\n    message_list: list[ChatMessage | dict],\n    role_conversions: dict[MessageRole, MessageRole] | dict[str, str] = {},\n    convert_images_to_image_urls: bool = False,\n    flatten_messages_as_text: bool = False,\n) -> list[dict[str, Any]]:\n    \"\"\"\n    Creates a list of messages to give as input to the LLM. 
These messages are dictionaries and chat template compatible with transformers LLM chat template.\n    Subsequent messages with the same role will be concatenated to a single message.\n\n    Args:\n        message_list (`list[ChatMessage | dict]`): List of chat messages. Mixed types are allowed.\n        role_conversions (`dict[MessageRole, MessageRole]`, *optional* ): Mapping to convert roles.\n        convert_images_to_image_urls (`bool`, default `False`): Whether to convert images to image URLs.\n        flatten_messages_as_text (`bool`, default `False`): Whether to flatten messages as text.\n    \"\"\"\n    output_message_list: list[dict[str, Any]] = []\n    message_list = deepcopy(message_list)  # Avoid modifying the original list\n    for message in message_list:\n        if isinstance(message, dict):\n            message = ChatMessage.from_dict(message)\n        role = message.role\n        if role not in MessageRole.roles():\n            raise ValueError(f\"Incorrect role {role}, only {MessageRole.roles()} are supported for now.\")\n\n        if role in role_conversions:\n            message.role = role_conversions[role]  # type: ignore\n        # encode images if needed\n        if isinstance(message.content, list):\n            for element in message.content:\n                assert isinstance(element, dict), \"Error: this element should be a dict:\" + str(element)\n                if element[\"type\"] == \"image\":\n                    assert not flatten_messages_as_text, f\"Cannot use images with {flatten_messages_as_text=}\"\n                    if convert_images_to_image_urls:\n                        element.update(\n                            {\n                                \"type\": \"image_url\",\n                                \"image_url\": {\"url\": make_image_url(encode_image_base64(element.pop(\"image\")))},\n                            }\n                        )\n                    else:\n                        element[\"image\"] = 
encode_image_base64(element[\"image\"])\n\n        if len(output_message_list) > 0 and message.role == output_message_list[-1][\"role\"]:\n            assert isinstance(message.content, list), \"Error: wrong content:\" + str(message.content)\n            if flatten_messages_as_text:\n                output_message_list[-1][\"content\"] += \"\\n\" + message.content[0][\"text\"]\n            else:\n                for el in message.content:\n                    if el[\"type\"] == \"text\" and output_message_list[-1][\"content\"][-1][\"type\"] == \"text\":\n                        # Merge consecutive text messages rather than creating new ones\n                        output_message_list[-1][\"content\"][-1][\"text\"] += \"\\n\" + el[\"text\"]\n                    else:\n                        output_message_list[-1][\"content\"].append(el)\n        else:\n            if flatten_messages_as_text:\n                content = message.content[0][\"text\"]\n            else:\n                content = message.content\n            output_message_list.append(\n                {\n                    \"role\": message.role,\n                    \"content\": content,\n                }\n            )\n    return output_message_list\n\n\ndef get_tool_call_from_text(text: str, tool_name_key: str, tool_arguments_key: str) -> ChatMessageToolCall:\n    tool_call_dictionary, _ = parse_json_blob(text)\n    try:\n        tool_name = tool_call_dictionary[tool_name_key]\n    except Exception as e:\n        raise ValueError(\n            f\"Tool call needs to have a key '{tool_name_key}'. 
Got keys: {list(tool_call_dictionary.keys())} instead\"\n        ) from e\n    tool_arguments = tool_call_dictionary.get(tool_arguments_key, None)\n    if isinstance(tool_arguments, str):\n        tool_arguments = parse_json_if_needed(tool_arguments)\n    return ChatMessageToolCall(\n        id=str(uuid.uuid4()),\n        type=\"function\",\n        function=ChatMessageToolCallFunction(name=tool_name, arguments=tool_arguments),\n    )\n\n\ndef supports_stop_parameter(model_id: str) -> bool:\n    \"\"\"\n    Check if the model supports the `stop` parameter.\n\n    Not supported with reasoning models openai/o3, openai/o4-mini, and the openai/gpt-5 series (and their versioned variants).\n\n    Args:\n        model_id (`str`): Model identifier (e.g. \"openai/o3\", \"o4-mini-2025-04-16\")\n\n    Returns:\n        bool: True if the model supports the stop parameter, False otherwise\n    \"\"\"\n    model_name = model_id.split(\"/\")[-1]\n    if model_name == \"o3-mini\":\n        return True\n    # o3* (except mini), o4*, all grok-* models, and the gpt-5* family (including versioned variants) don't support stop parameter\n    openai_model_pattern = r\"(o3(?:$|[-.].*)|o4(?:$|[-.].*)|gpt-5.*)\"\n    grok_model_pattern = r\"([A-Za-z][A-Za-z0-9_-]*\\.)?grok-[A-Za-z0-9][A-Za-z0-9_.-]*\"\n    pattern = rf\"^({openai_model_pattern}|{grok_model_pattern})$\"\n\n    return not re.match(pattern, model_name)\n\n\nclass _ParameterRemove:\n    \"\"\"Sentinel value to indicate a parameter should be removed.\"\"\"\n\n    def __repr__(self):\n        return \"REMOVE_PARAMETER\"\n\n\n# Singleton instance for removing parameters\nREMOVE_PARAMETER = _ParameterRemove()\n\n\nclass Model:\n    \"\"\"Base class for all language model implementations.\n\n    This abstract class defines the core interface that all model implementations must follow\n    to work with agents. 
It provides common functionality for message handling, tool integration,\n    and model configuration while allowing subclasses to implement their specific generation logic.\n\n    Parameters:\n        flatten_messages_as_text (`bool`, default `False`):\n            Whether to flatten complex message content into plain text format.\n        tool_name_key (`str`, default `\"name\"`):\n            The key used to extract tool names from model responses.\n        tool_arguments_key (`str`, default `\"arguments\"`):\n            The key used to extract tool arguments from model responses.\n        model_id (`str`, *optional*):\n            Identifier for the specific model being used.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying model completion call.\n\n    Note:\n        This is an abstract base class. Subclasses must implement the `generate()` method\n        to provide actual model inference capabilities.\n\n    Example:\n        ```python\n        class CustomModel(Model):\n            def generate(self, messages, **kwargs):\n                # Implementation specific to your model\n                pass\n        ```\n    \"\"\"\n\n    def __init__(\n        self,\n        flatten_messages_as_text: bool = False,\n        tool_name_key: str = \"name\",\n        tool_arguments_key: str = \"arguments\",\n        model_id: str | None = None,\n        **kwargs,\n    ):\n        self.flatten_messages_as_text = flatten_messages_as_text\n        self.tool_name_key = tool_name_key\n        self.tool_arguments_key = tool_arguments_key\n        self.kwargs = kwargs\n        self.model_id: str | None = model_id\n\n    @property\n    def supports_stop_parameter(self) -> bool:\n        return supports_stop_parameter(self.model_id or \"\")\n\n    def _prepare_completion_kwargs(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = 
None,\n        tools_to_call_from: list[Tool] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        convert_images_to_image_urls: bool = False,\n        tool_choice: str | dict | None = \"required\",  # Configurable tool_choice parameter\n        **kwargs,\n    ) -> dict[str, Any]:\n        \"\"\"\n        Prepare parameters required for model invocation.\n\n        Parameter priority (highest to lowest):\n        1. self.kwargs (model defaults)\n        2. Explicitly passed kwargs\n        3. Specific parameters (stop_sequences, response_format, etc.)\n        \"\"\"\n        # Clean and standardize the message list\n        flatten_messages_as_text = kwargs.pop(\"flatten_messages_as_text\", self.flatten_messages_as_text)\n        messages_as_dicts = get_clean_message_list(\n            messages,\n            role_conversions=custom_role_conversions or tool_role_conversions,\n            convert_images_to_image_urls=convert_images_to_image_urls,\n            flatten_messages_as_text=flatten_messages_as_text,\n        )\n        # Start with messages\n        completion_kwargs = {\n            \"messages\": messages_as_dicts,\n        }\n        # Override with specific parameters\n        if stop_sequences is not None and self.supports_stop_parameter:\n            # Some models do not support stop parameter\n            completion_kwargs[\"stop\"] = stop_sequences\n        if response_format is not None:\n            completion_kwargs[\"response_format\"] = response_format\n        if tools_to_call_from:\n            completion_kwargs[\"tools\"] = [get_tool_json_schema(tool) for tool in tools_to_call_from]\n            if tool_choice is not None:\n                completion_kwargs[\"tool_choice\"] = tool_choice\n        # Override with passed-in kwargs\n        completion_kwargs.update(kwargs)\n        # Override with self.kwargs\n        for kwarg_name, kwarg_value in self.kwargs.items():\n            if kwarg_value is 
REMOVE_PARAMETER:\n                completion_kwargs.pop(kwarg_name, None)  # Remove parameter if present\n            else:\n                completion_kwargs[kwarg_name] = kwarg_value  # Set/override parameter\n        return completion_kwargs\n\n    def generate(\n        self,\n        messages: list[ChatMessage],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        \"\"\"Process the input messages and return the model's response.\n\n        Parameters:\n            messages (`list[dict[str, str | list[dict]]] | list[ChatMessage]`):\n                A list of message dictionaries to be processed. Each dictionary should have the structure `{\"role\": \"user/system\", \"content\": \"message content\"}`.\n            stop_sequences (`List[str]`, *optional*):\n                A list of strings that will stop the generation if encountered in the model's output.\n            response_format (`dict[str, str]`, *optional*):\n                The response format to use in the model's response.\n            tools_to_call_from (`List[Tool]`, *optional*):\n                A list of tools that the model can use to generate responses.\n            **kwargs:\n                Additional keyword arguments to be passed to the underlying model.\n\n        Returns:\n            `ChatMessage`: A chat message object containing the model's response.\n        \"\"\"\n        raise NotImplementedError(\"This method must be implemented in child classes\")\n\n    def __call__(self, *args, **kwargs):\n        return self.generate(*args, **kwargs)\n\n    def parse_tool_calls(self, message: ChatMessage) -> ChatMessage:\n        \"\"\"Sometimes APIs do not return the tool call as a specific object, so we need to parse it.\"\"\"\n        message.role = MessageRole.ASSISTANT  # Overwrite role if needed\n        if not 
message.tool_calls:\n            assert message.content is not None, \"Message contains no content and no tool calls\"\n            message.tool_calls = [\n                get_tool_call_from_text(message.content, self.tool_name_key, self.tool_arguments_key)\n            ]\n        assert len(message.tool_calls) > 0, \"No tool call was found in the model output\"\n        for tool_call in message.tool_calls:\n            tool_call.function.arguments = parse_json_if_needed(tool_call.function.arguments)\n        return message\n\n    def to_dict(self) -> dict:\n        \"\"\"\n        Converts the model into a JSON-compatible dictionary.\n        \"\"\"\n        model_dictionary = {\n            **self.kwargs,\n            \"model_id\": self.model_id,\n        }\n        for attribute in [\n            \"custom_role_conversions\",\n            \"temperature\",\n            \"max_tokens\",\n            \"provider\",\n            \"timeout\",\n            \"api_base\",\n            \"torch_dtype\",\n            \"device_map\",\n            \"organization\",\n            \"project\",\n            \"azure_endpoint\",\n        ]:\n            if hasattr(self, attribute):\n                model_dictionary[attribute] = getattr(self, attribute)\n\n        dangerous_attributes = [\"token\", \"api_key\"]\n        for attribute_name in dangerous_attributes:\n            if hasattr(self, attribute_name):\n                print(\n                    f\"For security reasons, we do not export the `{attribute_name}` attribute of your model. 
Please export it manually.\"\n                )\n        return model_dictionary\n\n    @classmethod\n    def from_dict(cls, model_dictionary: dict[str, Any]) -> \"Model\":\n        return cls(**{k: v for k, v in model_dictionary.items()})\n\n\nclass VLLMModel(Model):\n    \"\"\"Model to use [vLLM](https://docs.vllm.ai/) for fast LLM inference and serving.\n\n    Parameters:\n        model_id (`str`):\n            The Hugging Face model ID to be used for inference.\n            This can be a path or model identifier from the Hugging Face model hub.\n        model_kwargs (`dict[str, Any]`, *optional*):\n            Additional keyword arguments to forward to the vLLM LLM instantiation, such as `revision`, `max_model_len`, etc.\n        apply_chat_template_kwargs (dict, *optional*):\n            Additional keyword arguments to pass to the `apply_chat_template` method of the tokenizer.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying vLLM model generate call.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id,\n        model_kwargs: dict[str, Any] | None = None,\n        apply_chat_template_kwargs: dict[str, Any] | None = None,\n        **kwargs,\n    ):\n        if not _is_package_available(\"vllm\"):\n            raise ModuleNotFoundError(\"Please install 'vllm' extra to use VLLMModel: `pip install 'smolagents[vllm]'`\")\n\n        from vllm import LLM  # type: ignore\n        from vllm.transformers_utils.tokenizer import get_tokenizer  # type: ignore\n\n        self.model_kwargs = model_kwargs or {}\n        self.apply_chat_template_kwargs = apply_chat_template_kwargs or {}\n        super().__init__(**kwargs)\n        self.model_id = model_id\n        self.model = LLM(model=model_id, **self.model_kwargs)\n        assert self.model is not None\n        self.tokenizer = get_tokenizer(model_id)\n        self._is_vlm = False  # VLLMModel does not support vision models yet.\n\n    def cleanup(self):\n        import 
gc\n\n        import torch\n        from vllm.distributed.parallel_state import (  # type: ignore\n            destroy_distributed_environment,\n            destroy_model_parallel,\n        )\n\n        destroy_model_parallel()\n        if self.model is not None:\n            # taken from https://github.com/vllm-project/vllm/issues/1908#issuecomment-2076870351\n            del self.model.llm_engine.model_executor.driver_worker\n        gc.collect()\n        destroy_distributed_environment()\n        torch.cuda.empty_cache()\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        from vllm import SamplingParams  # type: ignore\n        from vllm.sampling_params import StructuredOutputsParams  # type: ignore\n\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            flatten_messages_as_text=(not self._is_vlm),\n            stop_sequences=stop_sequences,\n            tools_to_call_from=tools_to_call_from,\n            **kwargs,\n        )\n        # Override the OpenAI schema for VLLM compatibility\n        structured_outputs = (\n            StructuredOutputsParams(json=response_format[\"json_schema\"][\"schema\"]) if response_format else None\n        )\n\n        messages = completion_kwargs.pop(\"messages\")\n        prepared_stop_sequences = completion_kwargs.pop(\"stop\", [])\n        tools = completion_kwargs.pop(\"tools\", None)\n        completion_kwargs.pop(\"tool_choice\", None)\n\n        prompt = self.tokenizer.apply_chat_template(\n            messages,\n            tools=tools,\n            add_generation_prompt=True,\n            tokenize=False,\n            **self.apply_chat_template_kwargs,\n        )\n\n        sampling_params = SamplingParams(\n            
n=kwargs.get(\"n\", 1),\n            temperature=kwargs.get(\"temperature\", 0.0),\n            max_tokens=kwargs.get(\"max_tokens\", 2048),\n            stop=prepared_stop_sequences,\n            structured_outputs=structured_outputs,\n        )\n\n        out = self.model.generate(\n            prompt,\n            sampling_params=sampling_params,\n            **completion_kwargs,\n        )\n\n        output_text = out[0].outputs[0].text\n        if stop_sequences is not None and not self.supports_stop_parameter:\n            output_text = remove_content_after_stop_sequences(output_text, stop_sequences)\n        return ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=output_text,\n            raw={\"out\": output_text, \"completion_kwargs\": completion_kwargs},\n            token_usage=TokenUsage(\n                input_tokens=len(out[0].prompt_token_ids),\n                output_tokens=len(out[0].outputs[0].token_ids),\n            ),\n        )\n\n\nclass MLXModel(Model):\n    \"\"\"A class to interact with models loaded using MLX on Apple silicon.\n\n    > [!TIP]\n    > You must have `mlx-lm` installed on your machine. Please run `pip install 'smolagents[mlx-lm]'` if it's not the case.\n\n    Parameters:\n        model_id (str):\n            The Hugging Face model ID to be used for inference. 
This can be a path or model identifier from the Hugging Face model hub.\n        tool_name_key (str):\n            The key, which can usually be found in the model's chat template, for retrieving a tool name.\n        tool_arguments_key (str):\n            The key, which can usually be found in the model's chat template, for retrieving tool arguments.\n        trust_remote_code (bool, default `False`):\n            Some models on the Hub require running remote code: for this model, you would have to set this flag to True.\n        load_kwargs (dict[str, Any], *optional*):\n            Additional keyword arguments to pass to the `mlx.lm.load` method when loading the model and tokenizer.\n        apply_chat_template_kwargs (dict, *optional*):\n            Additional keyword arguments to pass to the `apply_chat_template` method of the tokenizer.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying MLX model stream_generate call, for instance `max_tokens`.\n\n    Example:\n    ```python\n    >>> engine = MLXModel(\n    ...     model_id=\"mlx-community/Qwen2.5-Coder-32B-Instruct-4bit\",\n    ...     max_tokens=10000,\n    ... )\n    >>> messages = [\n    ...     {\n    ...         \"role\": \"user\",\n    ...         \"content\": \"Explain quantum mechanics in simple terms.\"\n    ...     }\n    ... 
]\n    >>> response = engine(messages, stop_sequences=[\"END\"])\n    >>> print(response)\n    \"Quantum mechanics is the branch of physics that studies...\"\n    ```\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str,\n        trust_remote_code: bool = False,\n        load_kwargs: dict[str, Any] | None = None,\n        apply_chat_template_kwargs: dict[str, Any] | None = None,\n        **kwargs,\n    ):\n        if not _is_package_available(\"mlx_lm\"):\n            raise ModuleNotFoundError(\n                \"Please install 'mlx-lm' extra to use 'MLXModel': `pip install 'smolagents[mlx-lm]'`\"\n            )\n        import mlx_lm\n\n        self.load_kwargs = load_kwargs or {}\n        self.load_kwargs.setdefault(\"tokenizer_config\", {}).setdefault(\"trust_remote_code\", trust_remote_code)\n        self.apply_chat_template_kwargs = apply_chat_template_kwargs or {}\n        self.apply_chat_template_kwargs.setdefault(\"add_generation_prompt\", True)\n        # mlx-lm doesn't support vision models: flatten_messages_as_text=True\n        super().__init__(model_id=model_id, flatten_messages_as_text=True, **kwargs)\n\n        self.model, self.tokenizer = mlx_lm.load(self.model_id, **self.load_kwargs)\n        self.stream_generate = mlx_lm.stream_generate\n        self.is_vlm = False  # mlx-lm doesn't support vision models\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        if response_format is not None:\n            raise ValueError(\"MLX does not support structured outputs.\")\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            tools_to_call_from=tools_to_call_from,\n            **kwargs,\n        )\n        
messages = completion_kwargs.pop(\"messages\")\n        stops = completion_kwargs.pop(\"stop\", [])\n        tools = completion_kwargs.pop(\"tools\", None)\n        completion_kwargs.pop(\"tool_choice\", None)\n\n        prompt_ids = self.tokenizer.apply_chat_template(messages, tools=tools, **self.apply_chat_template_kwargs)\n\n        output_tokens = 0\n        text = \"\"\n        for response in self.stream_generate(self.model, self.tokenizer, prompt=prompt_ids, **completion_kwargs):\n            output_tokens += 1\n            text += response.text\n            if any((stop_index := text.rfind(stop)) != -1 for stop in stops):\n                text = text[:stop_index]\n                break\n        if stop_sequences is not None and not self.supports_stop_parameter:\n            text = remove_content_after_stop_sequences(text, stop_sequences)\n        return ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=text,\n            raw={\"out\": text, \"completion_kwargs\": completion_kwargs},\n            token_usage=TokenUsage(\n                input_tokens=len(prompt_ids),\n                output_tokens=output_tokens,\n            ),\n        )\n\n\nclass TransformersModel(Model):\n    \"\"\"A class that uses Hugging Face's Transformers library for language model interaction.\n\n    This model allows you to load and use Hugging Face's models locally using the Transformers library. It supports features like stop sequences and grammar customization.\n\n    > [!TIP]\n    > You must have `transformers` and `torch` installed on your machine. Please run `pip install 'smolagents[transformers]'` if it's not the case.\n\n    Parameters:\n        model_id (`str`):\n            The Hugging Face model ID to be used for inference. 
This can be a path or model identifier from the Hugging Face model hub.\n            For example, `\"Qwen/Qwen3-Next-80B-A3B-Thinking\"`.\n        device_map (`str`, *optional*):\n            The device_map to initialize your model with.\n        torch_dtype (`str`, *optional*):\n            The torch_dtype to initialize your model with.\n        trust_remote_code (bool, default `False`):\n            Some models on the Hub require running remote code: for this model, you would have to set this flag to True.\n        model_kwargs (`dict[str, Any]`, *optional*):\n            Additional keyword arguments to pass to `AutoModel.from_pretrained` (like revision, model_args, config, etc.).\n        max_new_tokens (`int`, default `4096`):\n            Maximum number of new tokens to generate, ignoring the number of tokens in the prompt.\n        max_tokens (`int`, *optional*):\n            Alias for `max_new_tokens`. If provided, this value takes precedence.\n        apply_chat_template_kwargs (dict, *optional*):\n            Additional keyword arguments to pass to the `apply_chat_template` method of the tokenizer.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying Transformers model generate call, such as `device`.\n    Raises:\n        ValueError:\n            If the model name is not provided.\n\n    Example:\n    ```python\n    >>> engine = TransformersModel(\n    ...     model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\",\n    ...     device=\"cuda\",\n    ...     max_new_tokens=5000,\n    ... 
)\n    >>> messages = [{\"role\": \"user\", \"content\": \"Explain quantum mechanics in simple terms.\"}]\n    >>> response = engine(messages, stop_sequences=[\"END\"])\n    >>> print(response)\n    \"Quantum mechanics is the branch of physics that studies...\"\n    ```\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str | None = None,\n        device_map: str | None = None,\n        torch_dtype: str | None = None,\n        trust_remote_code: bool = False,\n        model_kwargs: dict[str, Any] | None = None,\n        max_new_tokens: int = 4096,\n        max_tokens: int | None = None,\n        apply_chat_template_kwargs: dict[str, Any] | None = None,\n        **kwargs,\n    ):\n        try:\n            import torch\n            from transformers import (\n                AutoModelForCausalLM,\n                AutoModelForImageTextToText,\n                AutoProcessor,\n                AutoTokenizer,\n                TextIteratorStreamer,\n            )\n        except ModuleNotFoundError:\n            raise ModuleNotFoundError(\n                \"Please install 'transformers' extra to use 'TransformersModel': `pip install 'smolagents[transformers]'`\"\n            )\n\n        if not model_id:\n            warnings.warn(\n                \"The 'model_id' parameter will be required in version 2.0.0. \"\n                \"Please update your code to pass this parameter to avoid future errors. 
\"\n                \"For now, it defaults to 'HuggingFaceTB/SmolLM2-1.7B-Instruct'.\",\n                FutureWarning,\n            )\n            model_id = \"HuggingFaceTB/SmolLM2-1.7B-Instruct\"\n\n        max_new_tokens = max_tokens if max_tokens is not None else max_new_tokens\n\n        if device_map is None:\n            device_map = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n        logger.info(f\"Using device: {device_map}\")\n        self._is_vlm = False\n        self.model_kwargs = model_kwargs or {}\n        self.apply_chat_template_kwargs = apply_chat_template_kwargs or {}\n        try:\n            self.model = AutoModelForImageTextToText.from_pretrained(\n                model_id,\n                device_map=device_map,\n                torch_dtype=torch_dtype,\n                trust_remote_code=trust_remote_code,\n                **self.model_kwargs,\n            )\n            self.processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=trust_remote_code)\n            self._is_vlm = True\n            self.streamer = TextIteratorStreamer(self.processor.tokenizer, skip_prompt=True, skip_special_tokens=True)  # type: ignore\n\n        except ValueError as e:\n            if \"Unrecognized configuration class\" in str(e):\n                self.model = AutoModelForCausalLM.from_pretrained(\n                    model_id,\n                    device_map=device_map,\n                    torch_dtype=torch_dtype,\n                    trust_remote_code=trust_remote_code,\n                    **self.model_kwargs,\n                )\n                self.tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=trust_remote_code)\n                self.streamer = TextIteratorStreamer(self.tokenizer, skip_prompt=True, skip_special_tokens=True)  # type: ignore\n            else:\n                raise e\n        except Exception as e:\n            raise ValueError(f\"Failed to load tokenizer and model for {model_id=}: {e}\") 
from e\n        super().__init__(\n            flatten_messages_as_text=not self._is_vlm, model_id=model_id, max_new_tokens=max_new_tokens, **kwargs\n        )\n\n    def make_stopping_criteria(self, stop_sequences: list[str], tokenizer) -> \"StoppingCriteriaList\":\n        from transformers import StoppingCriteria, StoppingCriteriaList\n\n        class StopOnStrings(StoppingCriteria):\n            def __init__(self, stop_strings: list[str], tokenizer):\n                self.stop_strings = stop_strings\n                self.tokenizer = tokenizer\n                self.stream = \"\"\n\n            def reset(self):\n                self.stream = \"\"\n\n            def __call__(self, input_ids, scores, **kwargs):\n                generated = self.tokenizer.decode(input_ids[0][-1], skip_special_tokens=True)\n                self.stream += generated\n                if any([self.stream.endswith(stop_string) for stop_string in self.stop_strings]):\n                    return True\n                return False\n\n        return StoppingCriteriaList([StopOnStrings(stop_sequences, tokenizer)])\n\n    def _prepare_completion_args(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> dict[str, Any]:\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            tools_to_call_from=tools_to_call_from,\n            tool_choice=None,\n            **kwargs,\n        )\n\n        messages = completion_kwargs.pop(\"messages\")\n        stop_sequences = completion_kwargs.pop(\"stop\", None)\n        tools = completion_kwargs.pop(\"tools\", None)\n\n        max_new_tokens = (\n            kwargs.get(\"max_new_tokens\")\n            or kwargs.get(\"max_tokens\")\n            or self.kwargs.get(\"max_new_tokens\")\n            or 
self.kwargs.get(\"max_tokens\")\n            or 1024\n        )\n        prompt_tensor = (self.processor if hasattr(self, \"processor\") else self.tokenizer).apply_chat_template(\n            messages,\n            tools=tools,\n            return_tensors=\"pt\",\n            add_generation_prompt=True,\n            tokenize=True,\n            return_dict=True,\n            **self.apply_chat_template_kwargs,\n        )\n        prompt_tensor = prompt_tensor.to(self.model.device)  # type: ignore\n        if hasattr(prompt_tensor, \"input_ids\"):\n            prompt_tensor = prompt_tensor[\"input_ids\"]\n\n        model_tokenizer = self.processor.tokenizer if hasattr(self, \"processor\") else self.tokenizer\n        stopping_criteria = (\n            self.make_stopping_criteria(stop_sequences, tokenizer=model_tokenizer) if stop_sequences else None\n        )\n        completion_kwargs[\"max_new_tokens\"] = max_new_tokens\n        return dict(\n            inputs=prompt_tensor,\n            use_cache=True,\n            stopping_criteria=stopping_criteria,\n            **completion_kwargs,\n        )\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        if response_format is not None:\n            raise ValueError(\"Transformers does not support structured outputs, use VLLMModel for this.\")\n        generation_kwargs = self._prepare_completion_args(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            tools_to_call_from=tools_to_call_from,\n            **kwargs,\n        )\n        count_prompt_tokens = generation_kwargs[\"inputs\"].shape[1]  # type: ignore\n        out = self.model.generate(\n            **generation_kwargs,\n        )\n        generated_tokens = out[0, count_prompt_tokens:]\n     
   if hasattr(self, \"processor\"):\n            output_text = self.processor.decode(generated_tokens, skip_special_tokens=True)\n        else:\n            output_text = self.tokenizer.decode(generated_tokens, skip_special_tokens=True)\n\n        if stop_sequences is not None:\n            output_text = remove_content_after_stop_sequences(output_text, stop_sequences)\n        return ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=output_text,\n            raw={\n                \"out\": output_text,\n                \"completion_kwargs\": {key: value for key, value in generation_kwargs.items() if key != \"inputs\"},\n            },\n            token_usage=TokenUsage(\n                input_tokens=count_prompt_tokens,\n                output_tokens=len(generated_tokens),\n            ),\n        )\n\n    def generate_stream(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> Generator[ChatMessageStreamDelta]:\n        if response_format is not None:\n            raise ValueError(\"Transformers does not support structured outputs, use VLLMModel for this.\")\n        generation_kwargs = self._prepare_completion_args(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            response_format=response_format,\n            tools_to_call_from=tools_to_call_from,\n            **kwargs,\n        )\n\n        # Get prompt token count once\n        count_prompt_tokens = generation_kwargs[\"inputs\"].shape[1]  # type: ignore\n\n        # Start generation in a separate thread\n        thread = Thread(target=self.model.generate, kwargs={\"streamer\": self.streamer, **generation_kwargs})\n        thread.start()\n\n        # Process streaming output\n        is_first_token = True\n        count_generated_tokens = 0\n        for 
new_text in self.streamer:\n            count_generated_tokens += 1\n            # Only include input tokens in the first yielded token\n            input_tokens = count_prompt_tokens if is_first_token else 0\n            is_first_token = False\n            yield ChatMessageStreamDelta(\n                content=new_text,\n                tool_calls=None,\n                token_usage=TokenUsage(input_tokens=input_tokens, output_tokens=1),\n            )\n            count_prompt_tokens = 0\n        thread.join()\n\n        # Update final output token count\n        self._last_output_token_count = count_generated_tokens\n\n\nclass ApiModel(Model):\n    \"\"\"\n    Base class for API-based language models.\n\n    This class serves as a foundation for implementing models that interact with\n    external APIs. It handles the common functionality for managing model IDs,\n    custom role mappings, and API client connections.\n\n    Parameters:\n        model_id (`str`):\n            The identifier for the model to be used with the API.\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Mapping to convert between internal role names and API-specific role names. Defaults to None.\n        client (`Any`, *optional*):\n            Pre-configured API client instance. If not provided, a default client will be created. Defaults to None.\n        requests_per_minute (`float`, *optional*):\n            Rate limit in requests per minute.\n        retry (`bool`, *optional*):\n            Whether to retry on rate limit errors, up to RETRY_MAX_ATTEMPTS times. 
Defaults to True.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying model completion call.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str,\n        custom_role_conversions: dict[str, str] | None = None,\n        client: Any | None = None,\n        requests_per_minute: float | None = None,\n        retry: bool = True,\n        **kwargs,\n    ):\n        super().__init__(model_id=model_id, **kwargs)\n        self.custom_role_conversions = custom_role_conversions or {}\n        self.client = client or self.create_client()\n        self.rate_limiter = RateLimiter(requests_per_minute)\n        self.retryer = Retrying(\n            max_attempts=RETRY_MAX_ATTEMPTS if retry else 1,\n            wait_seconds=RETRY_WAIT,\n            exponential_base=RETRY_EXPONENTIAL_BASE,\n            jitter=RETRY_JITTER,\n            retry_predicate=is_rate_limit_error,\n            reraise=True,\n            before_sleep_logger=(logger, logging.INFO),\n            after_logger=(logger, logging.INFO),\n        )\n\n    def create_client(self):\n        \"\"\"Create the API client for the specific service.\"\"\"\n        raise NotImplementedError(\"Subclasses must implement this method to create a client\")\n\n    def _apply_rate_limit(self):\n        \"\"\"Apply rate limiting before making API calls.\"\"\"\n        self.rate_limiter.throttle()\n\n\ndef is_rate_limit_error(exception: BaseException) -> bool:\n    \"\"\"Check if the exception is a rate limit error.\"\"\"\n    error_str = str(exception).lower()\n    return (\n        \"429\" in error_str\n        or \"rate limit\" in error_str\n        or \"too many requests\" in error_str\n        or \"rate_limit\" in error_str\n    )\n\n\nclass LiteLLMModel(ApiModel):\n    \"\"\"Model to use [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk) to access hundreds of LLMs.\n\n    Parameters:\n        model_id (`str`):\n            The model identifier to use on 
the server (e.g. \"gpt-3.5-turbo\").\n        api_base (`str`, *optional*):\n            The base URL of the provider API to call the model.\n        api_key (`str`, *optional*):\n            The API key to use for authentication.\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Mapping used to convert message roles into other roles.\n            Useful for models that do not support certain message roles, such as \"system\".\n        flatten_messages_as_text (`bool`, *optional*): Whether to flatten messages as text.\n            Defaults to `True` for model IDs that start with \"ollama\", \"groq\", or \"cerebras\".\n        **kwargs:\n            Additional keyword arguments to forward to the underlying LiteLLM completion call.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str | None = None,\n        api_base: str | None = None,\n        api_key: str | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        flatten_messages_as_text: bool | None = None,\n        **kwargs,\n    ):\n        if not model_id:\n            warnings.warn(\n                \"The 'model_id' parameter will be required in version 2.0.0. \"\n                \"Please update your code to pass this parameter to avoid future errors. 
\"\n                \"For now, it defaults to 'anthropic/claude-3-5-sonnet-20240620'.\",\n                FutureWarning,\n            )\n            model_id = \"anthropic/claude-3-5-sonnet-20240620\"\n        self.api_base = api_base\n        self.api_key = api_key\n        flatten_messages_as_text = (\n            flatten_messages_as_text\n            if flatten_messages_as_text is not None\n            else model_id.startswith((\"ollama\", \"groq\", \"cerebras\"))\n        )\n        super().__init__(\n            model_id=model_id,\n            custom_role_conversions=custom_role_conversions,\n            flatten_messages_as_text=flatten_messages_as_text,\n            **kwargs,\n        )\n\n    def create_client(self):\n        \"\"\"Create the LiteLLM client.\"\"\"\n        try:\n            import litellm\n        except ModuleNotFoundError as e:\n            raise ModuleNotFoundError(\n                \"Please install 'litellm' extra to use LiteLLMModel: `pip install 'smolagents[litellm]'`\"\n            ) from e\n\n        return litellm\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            response_format=response_format,\n            tools_to_call_from=tools_to_call_from,\n            model=self.model_id,\n            api_base=self.api_base,\n            api_key=self.api_key,\n            convert_images_to_image_urls=True,\n            custom_role_conversions=self.custom_role_conversions,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        response = self.retryer(self.client.completion, **completion_kwargs)\n\n        if not response.choices:\n            
raise RuntimeError(\n                f\"Unexpected API response: model '{self.model_id}' returned no choices. \"\n                \"This may indicate a possible API or upstream issue. \"\n                f\"Response details: {response.model_dump()}\"\n            )\n        content = response.choices[0].message.content\n        if stop_sequences is not None and not self.supports_stop_parameter:\n            content = remove_content_after_stop_sequences(content, stop_sequences)\n        return ChatMessage(\n            role=response.choices[0].message.role,\n            content=content,\n            tool_calls=response.choices[0].message.tool_calls,\n            raw=response,\n            token_usage=TokenUsage(\n                input_tokens=response.usage.prompt_tokens,\n                output_tokens=response.usage.completion_tokens,\n            ),\n        )\n\n    def generate_stream(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> Generator[ChatMessageStreamDelta]:\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            response_format=response_format,\n            tools_to_call_from=tools_to_call_from,\n            model=self.model_id,\n            api_base=self.api_base,\n            api_key=self.api_key,\n            custom_role_conversions=self.custom_role_conversions,\n            convert_images_to_image_urls=True,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        for event in self.retryer(\n            self.client.completion, **completion_kwargs, stream=True, stream_options={\"include_usage\": True}\n        ):\n            if getattr(event, \"usage\", None):\n                yield ChatMessageStreamDelta(\n                    
content=\"\",\n                    token_usage=TokenUsage(\n                        input_tokens=event.usage.prompt_tokens,\n                        output_tokens=event.usage.completion_tokens,\n                    ),\n                )\n            if event.choices:\n                choice = event.choices[0]\n                if choice.delta:\n                    yield ChatMessageStreamDelta(\n                        content=choice.delta.content,\n                        tool_calls=[\n                            ChatMessageToolCallStreamDelta(\n                                index=delta.index,\n                                id=delta.id,\n                                type=delta.type,\n                                function=delta.function,\n                            )\n                            for delta in choice.delta.tool_calls\n                        ]\n                        if choice.delta.tool_calls\n                        else None,\n                    )\n                else:\n                    if not getattr(choice, \"finish_reason\", None):\n                        raise ValueError(f\"No content or tool calls in event: {event}\")\n\n\nclass LiteLLMRouterModel(LiteLLMModel):\n    \"\"\"Router‑based client for interacting with the [LiteLLM Python SDK Router](https://docs.litellm.ai/docs/routing).\n\n    This class provides a high-level interface for distributing requests among multiple language models using\n    the LiteLLM SDK's routing capabilities. 
It is responsible for initializing and configuring the router client,\n    applying custom role conversions, and managing message formatting to ensure seamless integration with various LLMs.\n\n    Parameters:\n        model_id (`str`):\n            Identifier for the model group to use from the model list (e.g., \"model-group-1\").\n        model_list (`list[dict[str, Any]]`):\n            Model configurations to be used for routing.\n            Each configuration should include the model group name and any necessary parameters.\n            For more details, refer to the [LiteLLM Routing](https://docs.litellm.ai/docs/routing#quick-start) documentation.\n        client_kwargs (`dict[str, Any]`, *optional*):\n            Additional configuration parameters for the Router client. For more details, see the\n            [LiteLLM Routing Configurations](https://docs.litellm.ai/docs/routing).\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Custom role conversion mapping to convert message roles into others.\n            Useful for models that do not support certain message roles like \"system\".\n        flatten_messages_as_text (`bool`, *optional*): Whether to flatten messages as text.\n            Defaults to `True` for models that start with \"ollama\", \"groq\", \"cerebras\".\n        **kwargs:\n            Additional keyword arguments to forward to the underlying LiteLLM Router completion call.\n\n    Example:\n    ```python\n    >>> import os\n    >>> from smolagents import CodeAgent, WebSearchTool, LiteLLMRouterModel\n    >>> os.environ[\"OPENAI_API_KEY\"] = \"\"\n    >>> os.environ[\"AWS_ACCESS_KEY_ID\"] = \"\"\n    >>> os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"\"\n    >>> os.environ[\"AWS_REGION\"] = \"\"\n    >>> llm_loadbalancer_model_list = [\n    ...     {\n    ...         \"model_name\": \"model-group-1\",\n    ...         \"litellm_params\": {\n    ...             \"model\": \"gpt-4o-mini\",\n    ...             
\"api_key\": os.getenv(\"OPENAI_API_KEY\"),\n    ...         },\n    ...     },\n    ...     {\n    ...         \"model_name\": \"model-group-1\",\n    ...         \"litellm_params\": {\n    ...             \"model\": \"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\",\n    ...             \"aws_access_key_id\": os.getenv(\"AWS_ACCESS_KEY_ID\"),\n    ...             \"aws_secret_access_key\": os.getenv(\"AWS_SECRET_ACCESS_KEY\"),\n    ...             \"aws_region_name\": os.getenv(\"AWS_REGION\"),\n    ...         },\n    ...     },\n    >>> ]\n    >>> model = LiteLLMRouterModel(\n    ...    model_id=\"model-group-1\",\n    ...    model_list=llm_loadbalancer_model_list,\n    ...    client_kwargs={\n    ...        \"routing_strategy\":\"simple-shuffle\"\n    ...    }\n    >>> )\n    >>> agent = CodeAgent(tools=[WebSearchTool()], model=model)\n    >>> agent.run(\"How many seconds would it take for a leopard at full speed to run through Pont des Arts?\")\n    ```\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str,\n        model_list: list[dict[str, Any]],\n        client_kwargs: dict[str, Any] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        flatten_messages_as_text: bool | None = None,\n        **kwargs,\n    ):\n        self.client_kwargs = {\n            \"model_list\": model_list,\n            **(client_kwargs or {}),\n        }\n        super().__init__(\n            model_id=model_id,\n            custom_role_conversions=custom_role_conversions,\n            flatten_messages_as_text=flatten_messages_as_text,\n            **kwargs,\n        )\n\n    def create_client(self):\n        try:\n            from litellm.router import Router\n        except ModuleNotFoundError as e:\n            raise ModuleNotFoundError(\n                \"Please install 'litellm' extra to use LiteLLMRouterModel: `pip install 'smolagents[litellm]'`\"\n            ) from e\n        return Router(**self.client_kwargs)\n\n\nclass 
InferenceClientModel(ApiModel):\n    \"\"\"A class to interact with language models through Hugging Face's Inference Providers.\n\n    This model allows you to communicate with Hugging Face's models using Inference Providers. It can be used in serverless mode, with a dedicated endpoint, or even with a local URL, and supports features like stop sequences and grammar customization.\n\n    Providers include Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.\n\n    Parameters:\n        model_id (`str`, *optional*, default `\"Qwen/Qwen3-Next-80B-A3B-Thinking\"`):\n            The Hugging Face model ID to be used for inference.\n            This can be a model identifier from the Hugging Face model hub or a URL to a deployed Inference Endpoint.\n            Currently, it defaults to `\"Qwen/Qwen3-Next-80B-A3B-Thinking\"`, but this may change in the future.\n        provider (`str`, *optional*):\n            Name of the provider to use for inference. A list of supported providers can be found in the [Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners).\n            Defaults to \"auto\", i.e. the first of the providers available for the model, sorted by the user's order [here](https://hf.co/settings/inference-providers).\n            If `base_url` is passed, then `provider` is not used.\n        token (`str`, *optional*):\n            Token used by the Hugging Face API for authentication. 
This token needs the 'Make calls to the serverless Inference Providers' permission.\n            If the model is gated (like Llama-3 models), the token also needs 'Read access to contents of all public gated repos you can access'.\n            If not provided, the class will try to use the 'HF_TOKEN' environment variable, and otherwise the token stored in the Hugging Face CLI configuration.\n        timeout (`int`, *optional*, defaults to 120):\n            Timeout for the API request, in seconds.\n        client_kwargs (`dict[str, Any]`, *optional*):\n            Additional keyword arguments to pass to the Hugging Face InferenceClient.\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Custom role conversion mapping to convert message roles into others.\n            Useful for models that do not support certain message roles like \"system\".\n        api_key (`str`, *optional*):\n            Token to use for authentication. This duplicates `token` so that [`InferenceClientModel`]\n            follows the same pattern as the `openai.OpenAI` client. Cannot be used if `token` is set. Defaults to None.\n        bill_to (`str`, *optional*):\n            The billing account to use for the requests. By default, requests are billed to the user's account. Requests can only be billed to\n            an organization the user is a member of, and which has subscribed to Enterprise Hub.\n        base_url (`str`, *optional*):\n            Base URL to run inference. This duplicates `model_id` so that [`InferenceClientModel`]\n            follows the same pattern as the `openai.OpenAI` client. Cannot be used if `model_id` is set. 
Defaults to None.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying Hugging Face InferenceClient completion call.\n\n    Raises:\n        ValueError:\n            If the model name is not provided.\n\n    Example:\n    ```python\n    >>> engine = InferenceClientModel(\n    ...     model_id=\"Qwen/Qwen3-Next-80B-A3B-Thinking\",\n    ...     provider=\"hyperbolic\",\n    ...     token=\"your_hf_token_here\",\n    ...     max_tokens=5000,\n    ... )\n    >>> messages = [{\"role\": \"user\", \"content\": \"Explain quantum mechanics in simple terms.\"}]\n    >>> response = engine(messages, stop_sequences=[\"END\"])\n    >>> print(response)\n    \"Quantum mechanics is the branch of physics that studies...\"\n    ```\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str = \"Qwen/Qwen3-Next-80B-A3B-Thinking\",\n        provider: str | None = None,\n        token: str | None = None,\n        timeout: int = 120,\n        client_kwargs: dict[str, Any] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        api_key: str | None = None,\n        bill_to: str | None = None,\n        base_url: str | None = None,\n        **kwargs,\n    ):\n        if token is not None and api_key is not None:\n            raise ValueError(\n                \"Received both `token` and `api_key` arguments. 
Please provide only one of them.\"\n                \" `api_key` is an alias for `token` to make the API compatible with OpenAI's client.\"\n                \" It has the exact same behavior as `token`.\"\n            )\n        token = token if token is not None else api_key\n        if token is None:\n            token = os.getenv(\"HF_TOKEN\")\n        self.client_kwargs = {\n            **(client_kwargs or {}),\n            \"model\": model_id,\n            \"provider\": provider,\n            \"token\": token,\n            \"timeout\": timeout,\n            \"bill_to\": bill_to,\n            \"base_url\": base_url,\n        }\n        super().__init__(model_id=model_id, custom_role_conversions=custom_role_conversions, **kwargs)\n\n    def create_client(self):\n        \"\"\"Create the Hugging Face client.\"\"\"\n        from huggingface_hub import InferenceClient\n\n        return InferenceClient(**self.client_kwargs)\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        if response_format is not None and self.client_kwargs[\"provider\"] not in STRUCTURED_GENERATION_PROVIDERS:\n            raise ValueError(\n                \"InferenceClientModel only supports structured outputs with these providers: \"\n                + \", \".join(STRUCTURED_GENERATION_PROVIDERS)\n            )\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            tools_to_call_from=tools_to_call_from,\n            # response_format=response_format,\n            convert_images_to_image_urls=True,\n            custom_role_conversions=self.custom_role_conversions,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        response = 
self.retryer(self.client.chat_completion, **completion_kwargs)\n        content = response.choices[0].message.content\n        if stop_sequences is not None and not self.supports_stop_parameter:\n            content = remove_content_after_stop_sequences(content, stop_sequences)\n        return ChatMessage(\n            role=response.choices[0].message.role,\n            content=content,\n            tool_calls=response.choices[0].message.tool_calls,\n            raw=response,\n            token_usage=TokenUsage(\n                input_tokens=response.usage.prompt_tokens,\n                output_tokens=response.usage.completion_tokens,\n            ),\n        )\n\n    def generate_stream(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> Generator[ChatMessageStreamDelta]:\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            response_format=response_format,\n            tools_to_call_from=tools_to_call_from,\n            model=self.model_id,\n            custom_role_conversions=self.custom_role_conversions,\n            convert_images_to_image_urls=True,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        for event in self.retryer(\n            self.client.chat.completions.create,\n            **completion_kwargs,\n            stream=True,\n            stream_options={\"include_usage\": True},\n        ):\n            if getattr(event, \"usage\", None):\n                yield ChatMessageStreamDelta(\n                    content=\"\",\n                    token_usage=TokenUsage(\n                        input_tokens=event.usage.prompt_tokens,\n                        output_tokens=event.usage.completion_tokens,\n                    ),\n               
 )\n            if event.choices:\n                choice = event.choices[0]\n                if choice.delta:\n                    yield ChatMessageStreamDelta(\n                        content=choice.delta.content,\n                        tool_calls=[\n                            ChatMessageToolCallStreamDelta(\n                                index=delta.index,\n                                id=delta.id,\n                                type=delta.type,\n                                function=delta.function,\n                            )\n                            for delta in choice.delta.tool_calls\n                        ]\n                        if choice.delta.tool_calls\n                        else None,\n                    )\n                else:\n                    if not getattr(choice, \"finish_reason\", None):\n                        raise ValueError(f\"No content or tool calls in event: {event}\")\n\n\nclass OpenAIModel(ApiModel):\n    \"\"\"This model connects to an OpenAI-compatible API server.\n\n    Parameters:\n        model_id (`str`):\n            The model identifier to use on the server (e.g. 
\"gpt-5\").\n        api_base (`str`, *optional*):\n            The base URL of the OpenAI-compatible API server.\n        api_key (`str`, *optional*):\n            The API key to use for authentication.\n        organization (`str`, *optional*):\n            The organization to use for the API request.\n        project (`str`, *optional*):\n            The project to use for the API request.\n        client_kwargs (`dict[str, Any]`, *optional*):\n            Additional keyword arguments to pass to the OpenAI client (like organization, project, max_retries etc.).\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Custom role conversion mapping to convert message roles in others.\n            Useful for specific models that do not support specific message roles like \"system\".\n        flatten_messages_as_text (`bool`, default `False`):\n            Whether to flatten messages as text.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying OpenAI API completion call, for instance `temperature`.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str,\n        api_base: str | None = None,\n        api_key: str | None = None,\n        organization: str | None = None,\n        project: str | None = None,\n        client_kwargs: dict[str, Any] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        flatten_messages_as_text: bool = False,\n        **kwargs,\n    ):\n        self.client_kwargs = {\n            **(client_kwargs or {}),\n            \"api_key\": api_key,\n            \"base_url\": api_base,\n            \"organization\": organization,\n            \"project\": project,\n        }\n        super().__init__(\n            model_id=model_id,\n            custom_role_conversions=custom_role_conversions,\n            flatten_messages_as_text=flatten_messages_as_text,\n            **kwargs,\n        )\n\n    def create_client(self):\n        try:\n         
   import openai\n        except ModuleNotFoundError as e:\n            raise ModuleNotFoundError(\n                \"Please install 'openai' extra to use OpenAIModel: `pip install 'smolagents[openai]'`\"\n            ) from e\n\n        return openai.OpenAI(**self.client_kwargs)\n\n    def generate_stream(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> Generator[ChatMessageStreamDelta]:\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            response_format=response_format,\n            tools_to_call_from=tools_to_call_from,\n            model=self.model_id,\n            custom_role_conversions=self.custom_role_conversions,\n            convert_images_to_image_urls=True,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        for event in self.retryer(\n            self.client.chat.completions.create,\n            **completion_kwargs,\n            stream=True,\n            stream_options={\"include_usage\": True},\n        ):\n            if event.usage:\n                yield ChatMessageStreamDelta(\n                    content=\"\",\n                    token_usage=TokenUsage(\n                        input_tokens=event.usage.prompt_tokens,\n                        output_tokens=event.usage.completion_tokens,\n                    ),\n                )\n            if event.choices:\n                choice = event.choices[0]\n                if choice.delta:\n                    yield ChatMessageStreamDelta(\n                        content=choice.delta.content,\n                        tool_calls=[\n                            ChatMessageToolCallStreamDelta(\n                                index=delta.index,\n                                
id=delta.id,\n                                type=delta.type,\n                                function=delta.function,\n                            )\n                            for delta in choice.delta.tool_calls\n                        ]\n                        if choice.delta.tool_calls\n                        else None,\n                    )\n                else:\n                    if not getattr(choice, \"finish_reason\", None):\n                        raise ValueError(f\"No content or tool calls in event: {event}\")\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        completion_kwargs = self._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=stop_sequences,\n            response_format=response_format,\n            tools_to_call_from=tools_to_call_from,\n            model=self.model_id,\n            custom_role_conversions=self.custom_role_conversions,\n            convert_images_to_image_urls=True,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        response = self.retryer(self.client.chat.completions.create, **completion_kwargs)\n        content = response.choices[0].message.content\n        if stop_sequences is not None and not self.supports_stop_parameter:\n            content = remove_content_after_stop_sequences(content, stop_sequences)\n        return ChatMessage(\n            role=response.choices[0].message.role,\n            content=content,\n            tool_calls=response.choices[0].message.tool_calls,\n            raw=response,\n            token_usage=TokenUsage(\n                input_tokens=response.usage.prompt_tokens,\n                output_tokens=response.usage.completion_tokens,\n            ),\n        )\n\n\nOpenAIServerModel = 
OpenAIModel\n\n\nclass AzureOpenAIModel(OpenAIModel):\n    \"\"\"This model connects to an Azure OpenAI deployment.\n\n    Parameters:\n        model_id (`str`):\n            The model deployment name to use when connecting (e.g. \"gpt-4o-mini\").\n        azure_endpoint (`str`, *optional*):\n            The Azure endpoint, including the resource, e.g. `https://example-resource.azure.openai.com/`. If not provided, it will be inferred from the `AZURE_OPENAI_ENDPOINT` environment variable.\n        api_key (`str`, *optional*):\n            The API key to use for authentication. If not provided, it will be inferred from the `AZURE_OPENAI_API_KEY` environment variable.\n        api_version (`str`, *optional*):\n            The API version to use. If not provided, it will be inferred from the `OPENAI_API_VERSION` environment variable.\n        client_kwargs (`dict[str, Any]`, *optional*):\n            Additional keyword arguments to pass to the AzureOpenAI client (like organization, project, max_retries etc.).\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Custom role conversion mapping to convert message roles into others.\n            Useful for models that do not support certain message roles like \"system\".\n        **kwargs:\n            Additional keyword arguments to forward to the underlying Azure OpenAI API completion call.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str,\n        azure_endpoint: str | None = None,\n        api_key: str | None = None,\n        api_version: str | None = None,\n        client_kwargs: dict[str, Any] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        **kwargs,\n    ):\n        # Build a new dict so the caller's `client_kwargs` is not mutated.\n        client_kwargs = {\n            **(client_kwargs or {}),\n            \"api_version\": api_version,\n            \"azure_endpoint\": azure_endpoint,\n        }\n        super().__init__(\n            
model_id=model_id,\n            api_key=api_key,\n            client_kwargs=client_kwargs,\n            custom_role_conversions=custom_role_conversions,\n            **kwargs,\n        )\n\n    def create_client(self):\n        try:\n            import openai\n        except ModuleNotFoundError as e:\n            raise ModuleNotFoundError(\n                \"Please install 'openai' extra to use AzureOpenAIModel: `pip install 'smolagents[openai]'`\"\n            ) from e\n\n        return openai.AzureOpenAI(**self.client_kwargs)\n\n\nAzureOpenAIServerModel = AzureOpenAIModel\n\n\nclass AmazonBedrockModel(ApiModel):\n    \"\"\"\n    A model class for interacting with Amazon Bedrock Server models through the Bedrock API.\n\n    This class provides an interface to interact with various Bedrock language models,\n    allowing for customized model inference, guardrail configuration, message handling,\n    and other parameters allowed by boto3 API.\n\n    Authentication:\n\n    Amazon Bedrock supports multiple authentication methods:\n    - Default AWS credentials:\n       Use the default AWS credential chain (e.g., IAM roles, IAM users).\n    - API Key Authentication (requires `boto3 >= 1.39.0`):\n       Set the API key using the `AWS_BEARER_TOKEN_BEDROCK` environment variable.\n\n    > [!TIP]\n    > API key support requires `boto3 >= 1.39.0`.\n    > For users not relying on API key authentication, the minimum supported version is `boto3 >= 1.36.18`.\n\n    Parameters:\n        model_id (`str`):\n            The model identifier to use on Bedrock (e.g. \"us.amazon.nova-pro-v1:0\").\n        client (`boto3.client`, *optional*):\n            A custom boto3 client for AWS interactions. 
If not provided, a default client will be created.\n        client_kwargs (`dict[str, Any]`, *optional*):\n            Keyword arguments used to configure the boto3 client if it needs to be created internally.\n            Examples include `region_name`, `config`, or `endpoint_url`.\n        custom_role_conversions (`dict[str, str]`, *optional*):\n            Custom role conversion mapping to convert message roles into others.\n            Useful for models that do not support certain message roles like \"system\".\n            Defaults to converting all roles to the \"user\" role so that every Bedrock model can be used.\n        flatten_messages_as_text (`bool`, always `False`):\n            Not configurable: the Bedrock API requires a list of messages, so messages are never flattened as text.\n        **kwargs:\n            Additional keyword arguments to forward to the underlying Amazon Bedrock model converse call.\n\n    Examples:\n        Creating a model instance with default settings:\n        ```python\n        >>> bedrock_model = AmazonBedrockModel(\n        ...     model_id='us.amazon.nova-pro-v1:0'\n        ... )\n        ```\n\n        Creating a model instance with a custom boto3 client:\n        ```python\n        >>> import boto3\n        >>> client = boto3.client('bedrock-runtime', region_name='us-west-2')\n        >>> bedrock_model = AmazonBedrockModel(\n        ...     model_id='us.amazon.nova-pro-v1:0',\n        ...     client=client\n        ... )\n        ```\n\n        Creating a model instance with client_kwargs for internal client creation:\n        ```python\n        >>> bedrock_model = AmazonBedrockModel(\n        ...     model_id='us.amazon.nova-pro-v1:0',\n        ...     client_kwargs={'region_name': 'us-west-2', 'endpoint_url': 'https://custom-endpoint.com'}\n        ... )\n        ```\n\n        Creating a model instance with inference and guardrail configurations:\n        ```python\n        >>> additional_api_config = {\n        ...     \"inferenceConfig\": {\n        ...         
\"maxTokens\": 3000\n        ...     },\n        ...     \"guardrailConfig\": {\n        ...         \"guardrailIdentifier\": \"identify1\",\n        ...         \"guardrailVersion\": 'v1'\n        ...     },\n        ... }\n        >>> bedrock_model = AmazonBedrockModel(\n        ...     model_id='anthropic.claude-3-haiku-20240307-v1:0',\n        ...     **additional_api_config\n        ... )\n        ```\n    \"\"\"\n\n    def __init__(\n        self,\n        model_id: str,\n        client=None,\n        client_kwargs: dict[str, Any] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        **kwargs,\n    ):\n        self.client_kwargs = client_kwargs or {}\n\n        # Bedrock only supports `assistant` and `user` roles.\n        # Many Bedrock models do not allow conversations to start with the `assistant` role, so the default is set to `user/user`.\n        # This parameter is retained for future model implementations and extended support.\n        custom_role_conversions = custom_role_conversions or {\n            MessageRole.SYSTEM: MessageRole.USER,\n            MessageRole.ASSISTANT: MessageRole.USER,\n            MessageRole.TOOL_CALL: MessageRole.USER,\n            MessageRole.TOOL_RESPONSE: MessageRole.USER,\n        }\n\n        super().__init__(\n            model_id=model_id,\n            custom_role_conversions=custom_role_conversions,\n            flatten_messages_as_text=False,  # Bedrock API doesn't support flatten messages, must be a list of messages\n            client=client,\n            **kwargs,\n        )\n\n    def _prepare_completion_kwargs(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        custom_role_conversions: dict[str, str] | None = None,\n        convert_images_to_image_urls: bool = False,\n        tool_choice: str | 
dict[Any, Any] | None = None,\n        **kwargs,\n    ) -> dict:\n        \"\"\"\n        Overrides the base method to handle Bedrock-specific configurations.\n\n        This implementation adapts the completion keyword arguments to align with\n        Bedrock's requirements, ensuring compatibility with its unique setup and\n        constraints.\n        \"\"\"\n        completion_kwargs = super()._prepare_completion_kwargs(\n            messages=messages,\n            stop_sequences=None,  # Bedrock supports stop sequences through `inferenceConfig`\n            tools_to_call_from=tools_to_call_from,\n            custom_role_conversions=custom_role_conversions,\n            convert_images_to_image_urls=convert_images_to_image_urls,\n            **kwargs,\n        )\n        # Not all models in Bedrock support `toolConfig`. Also, smolagents already includes the tool call in the prompt,\n        # so adding `toolConfig` could cause conflicts. We remove it to avoid issues.\n        completion_kwargs.pop(\"toolConfig\", None)\n\n        # The Bedrock API does not support the `type` key in requests.\n        # Strip it from every content block to meet Bedrock's requirements.\n        for message in completion_kwargs.get(\"messages\", []):\n            for content in message.get(\"content\", []):\n                if \"type\" in content:\n                    del content[\"type\"]\n\n        return {\n            \"modelId\": self.model_id,\n            **completion_kwargs,\n        }\n\n    def create_client(self):\n        try:\n            import boto3  # type: ignore\n        except ModuleNotFoundError as e:\n            raise ModuleNotFoundError(\n                \"Please install 'bedrock' extra to use AmazonBedrockServerModel: `pip install 'smolagents[bedrock]'`\"\n            ) from e\n\n        return boto3.client(\"bedrock-runtime\", **self.client_kwargs)\n\n    def generate(\n        self,\n        messages: list[ChatMessage | dict],\n        stop_sequences: 
list[str] | None = None,\n        response_format: dict[str, str] | None = None,\n        tools_to_call_from: list[Tool] | None = None,\n        **kwargs,\n    ) -> ChatMessage:\n        if response_format is not None:\n            raise ValueError(\"Amazon Bedrock does not support response_format\")\n        completion_kwargs: dict = self._prepare_completion_kwargs(\n            messages=messages,\n            tools_to_call_from=tools_to_call_from,\n            custom_role_conversions=self.custom_role_conversions,\n            convert_images_to_image_urls=True,\n            **kwargs,\n        )\n        self._apply_rate_limit()\n        # self.client is created in ApiModel class\n        response = self.retryer(self.client.converse, **completion_kwargs)\n\n        # Get content blocks with \"text\" key: in case thinking blocks are present, discard them\n        message_content_blocks_with_text = [\n            block for block in response[\"output\"][\"message\"][\"content\"] if \"text\" in block\n        ]\n        if not message_content_blocks_with_text:\n            raise KeyError(\"No message content blocks with 'text' key found in response\")\n        # Keep the last one\n        content = message_content_blocks_with_text[-1][\"text\"]\n        if stop_sequences is not None and not self.supports_stop_parameter:\n            content = remove_content_after_stop_sequences(content, stop_sequences)\n        return ChatMessage(\n            role=response[\"output\"][\"message\"][\"role\"],\n            content=content,\n            tool_calls=response[\"output\"][\"message\"][\"tool_calls\"],\n            raw=response,\n            token_usage=TokenUsage(\n                input_tokens=response[\"usage\"][\"inputTokens\"],\n                output_tokens=response[\"usage\"][\"outputTokens\"],\n            ),\n        )\n\n\nAmazonBedrockServerModel = AmazonBedrockModel\n\n\n# Model Registry for secure deserialization\n# This registry maps model class names to their 
actual classes.\n# Only classes listed here can be instantiated during deserialization (from_dict).\n# This prevents arbitrary code execution via importlib-based dynamic loading.\nMODEL_REGISTRY = {\n    \"VLLMModel\": VLLMModel,\n    \"MLXModel\": MLXModel,\n    \"TransformersModel\": TransformersModel,\n    \"LiteLLMModel\": LiteLLMModel,\n    \"LiteLLMRouterModel\": LiteLLMRouterModel,\n    \"InferenceClientModel\": InferenceClientModel,\n    \"OpenAIModel\": OpenAIModel,\n    \"AzureOpenAIModel\": AzureOpenAIModel,\n    \"AmazonBedrockModel\": AmazonBedrockModel,\n}\n\n__all__ = [\n    \"REMOVE_PARAMETER\",\n    \"MessageRole\",\n    \"tool_role_conversions\",\n    \"get_clean_message_list\",\n    \"Model\",\n    \"MLXModel\",\n    \"TransformersModel\",\n    \"ApiModel\",\n    \"InferenceClientModel\",\n    \"LiteLLMModel\",\n    \"LiteLLMRouterModel\",\n    \"OpenAIServerModel\",\n    \"OpenAIModel\",\n    \"VLLMModel\",\n    \"AzureOpenAIServerModel\",\n    \"AzureOpenAIModel\",\n    \"AmazonBedrockServerModel\",\n    \"AmazonBedrockModel\",\n    \"ChatMessage\",\n]\n"
  },
  {
    "path": "src/smolagents/monitoring.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport json\nfrom dataclasses import dataclass, field\nfrom enum import IntEnum\n\nfrom rich import box\nfrom rich.console import Console, Group\nfrom rich.panel import Panel\nfrom rich.rule import Rule\nfrom rich.syntax import Syntax\nfrom rich.table import Table\nfrom rich.text import Text\nfrom rich.tree import Tree\n\nfrom smolagents.utils import sanitize_for_rich\n\n\n__all__ = [\"AgentLogger\", \"LogLevel\", \"Monitor\", \"TokenUsage\", \"Timing\"]\n\n\n@dataclass\nclass TokenUsage:\n    \"\"\"\n    Contains the token usage information for a given step or run.\n    \"\"\"\n\n    input_tokens: int\n    output_tokens: int\n    total_tokens: int = field(init=False)\n\n    def __post_init__(self):\n        self.total_tokens = self.input_tokens + self.output_tokens\n\n    def dict(self):\n        return {\n            \"input_tokens\": self.input_tokens,\n            \"output_tokens\": self.output_tokens,\n            \"total_tokens\": self.total_tokens,\n        }\n\n\n@dataclass\nclass Timing:\n    \"\"\"\n    Contains the timing information for a given step or run.\n    \"\"\"\n\n    start_time: float\n    end_time: float | None = None\n\n    @property\n    def duration(self):\n        return None if self.end_time is None else self.end_time - self.start_time\n\n    def dict(self):\n        
return {\n            \"start_time\": self.start_time,\n            \"end_time\": self.end_time,\n            \"duration\": self.duration,\n        }\n\n    def __repr__(self) -> str:\n        return f\"Timing(start_time={self.start_time}, end_time={self.end_time}, duration={self.duration})\"\n\n\nclass Monitor:\n    def __init__(self, tracked_model, logger):\n        self.step_durations = []\n        self.tracked_model = tracked_model\n        self.logger = logger\n        self.total_input_token_count = 0\n        self.total_output_token_count = 0\n\n    def get_total_token_counts(self) -> TokenUsage:\n        return TokenUsage(\n            input_tokens=self.total_input_token_count,\n            output_tokens=self.total_output_token_count,\n        )\n\n    def reset(self):\n        self.step_durations = []\n        self.total_input_token_count = 0\n        self.total_output_token_count = 0\n\n    def update_metrics(self, step_log):\n        \"\"\"Update the metrics of the monitor.\n\n        Args:\n            step_log ([`MemoryStep`]): Step log to update the monitor with.\n        \"\"\"\n        step_duration = step_log.timing.duration\n        self.step_durations.append(step_duration)\n        console_outputs = f\"[Step {len(self.step_durations)}: Duration {step_duration:.2f} seconds\"\n\n        if step_log.token_usage is not None:\n            self.total_input_token_count += step_log.token_usage.input_tokens\n            self.total_output_token_count += step_log.token_usage.output_tokens\n            console_outputs += (\n                f\"| Input tokens: {self.total_input_token_count:,} | Output tokens: {self.total_output_token_count:,}\"\n            )\n        console_outputs += \"]\"\n        self.logger.log(Text(console_outputs, style=\"dim\"), level=1)\n\n\nclass LogLevel(IntEnum):\n    OFF = -1  # No output\n    ERROR = 0  # Only errors\n    INFO = 1  # Normal output (default)\n    DEBUG = 2  # Detailed output\n\n\nYELLOW_HEX = 
\"#d4b702\"\n\n\nclass AgentLogger:\n    def __init__(self, level: LogLevel = LogLevel.INFO, console: Console | None = None):\n        self.level = level\n        if console is None:\n            self.console = Console(highlight=False)\n        else:\n            self.console = console\n\n    def log(self, *args, level: int | str | LogLevel = LogLevel.INFO, **kwargs) -> None:\n        \"\"\"Logs a message to the console.\n\n        Args:\n            level (LogLevel, optional): Defaults to LogLevel.INFO.\n        \"\"\"\n        if isinstance(level, str):\n            level = LogLevel[level.upper()]\n        if level <= self.level:\n            self.console.print(*args, **kwargs)\n\n    def log_error(self, error_message: str) -> None:\n        self.log(Text(sanitize_for_rich(error_message), style=\"bold red\"), level=LogLevel.ERROR)\n\n    def log_markdown(self, content: str, title: str | None = None, level=LogLevel.INFO, style=YELLOW_HEX) -> None:\n        markdown_content = Syntax(\n            content,\n            lexer=\"markdown\",\n            theme=\"github-dark\",\n            word_wrap=True,\n        )\n        if title:\n            self.log(\n                Group(\n                    Rule(\n                        \"[bold italic]\" + title,\n                        align=\"left\",\n                        style=style,\n                    ),\n                    markdown_content,\n                ),\n                level=level,\n            )\n        else:\n            self.log(markdown_content, level=level)\n\n    def log_code(self, title: str, content: str, level: int = LogLevel.INFO) -> None:\n        self.log(\n            Panel(\n                Syntax(\n                    content,\n                    lexer=\"python\",\n                    theme=\"monokai\",\n                    word_wrap=True,\n                ),\n                title=\"[bold]\" + title,\n                title_align=\"left\",\n                box=box.HORIZONTALS,\n          
  ),\n            level=level,\n        )\n\n    def log_rule(self, title: str, level: int = LogLevel.INFO) -> None:\n        self.log(\n            Rule(\n                \"[bold white]\" + title,\n                characters=\"━\",\n                style=YELLOW_HEX,\n            ),\n            level=level,\n        )\n\n    def log_task(self, content: str, subtitle: str, title: str | None = None, level: LogLevel = LogLevel.INFO) -> None:\n        # Important: `content` can contain arbitrary tool logs / payloads. If we embed it\n        # inside Rich markup (e.g. f\"[bold]{content}\"), any stray \"[/...]\" sequences or\n        # binary-ish characters can crash Rich's markup parser. Render the content as\n        # `Text` instead, and apply styling via Text/style, not markup.\n        safe_content = sanitize_for_rich(content)\n        safe_subtitle = sanitize_for_rich(subtitle)\n        content_text = Text(\"\\n\") + Text(safe_content, style=\"bold\") + Text(\"\\n\")\n        subtitle_text = Text(safe_subtitle)\n        self.log(\n            Panel(\n                content_text,\n                title=\"[bold]New run\" + (f\" - {title}\" if title else \"\"),\n                subtitle=subtitle_text,\n                border_style=YELLOW_HEX,\n                subtitle_align=\"left\",\n            ),\n            level=level,\n        )\n\n    def log_messages(self, messages: list[dict], level: LogLevel = LogLevel.DEBUG) -> None:\n        messages_as_string = \"\\n\".join([json.dumps(message.dict(), indent=4) for message in messages])\n        self.log(\n            Syntax(\n                messages_as_string,\n                lexer=\"markdown\",\n                theme=\"github-dark\",\n                word_wrap=True,\n            ),\n            level=level,\n        )\n\n    def visualize_agent_tree(self, agent):\n        def create_tools_section(tools_dict):\n            table = Table(show_header=True, header_style=\"bold\")\n            
table.add_column(\"Name\", style=\"#1E90FF\")\n            table.add_column(\"Description\")\n            table.add_column(\"Arguments\")\n\n            for name, tool in tools_dict.items():\n                args = [\n                    f\"{arg_name} (`{info.get('type', 'Any')}`{', optional' if info.get('optional') else ''}): {info.get('description', '')}\"\n                    for arg_name, info in getattr(tool, \"inputs\", {}).items()\n                ]\n                table.add_row(name, getattr(tool, \"description\", str(tool)), \"\\n\".join(args))\n\n            return Group(\"🛠️ [italic #1E90FF]Tools:[/italic #1E90FF]\", table)\n\n        def get_agent_headline(agent, name: str | None = None):\n            name_headline = f\"{name} | \" if name else \"\"\n            return f\"[bold {YELLOW_HEX}]{name_headline}{agent.__class__.__name__} | {agent.model.model_id}\"\n\n        def build_agent_tree(parent_tree, agent_obj):\n            \"\"\"Recursively builds the agent tree.\"\"\"\n            parent_tree.add(create_tools_section(agent_obj.tools))\n\n            if agent_obj.managed_agents:\n                agents_branch = parent_tree.add(\"🤖 [italic #1E90FF]Managed agents:\")\n                for name, managed_agent in agent_obj.managed_agents.items():\n                    agent_tree = agents_branch.add(get_agent_headline(managed_agent, name))\n                    if managed_agent.__class__.__name__ == \"CodeAgent\":\n                        agent_tree.add(\n                            f\"✅ [italic #1E90FF]Authorized imports:[/italic #1E90FF] {managed_agent.additional_authorized_imports}\"\n                        )\n                    agent_tree.add(f\"📝 [italic #1E90FF]Description:[/italic #1E90FF] {managed_agent.description}\")\n                    build_agent_tree(agent_tree, managed_agent)\n\n        main_tree = Tree(get_agent_headline(agent))\n        if agent.__class__.__name__ == \"CodeAgent\":\n            main_tree.add(\n                f\"✅ 
[italic #1E90FF]Authorized imports:[/italic #1E90FF] {agent.additional_authorized_imports}\"\n            )\n        build_agent_tree(main_tree, agent)\n        self.console.print(main_tree)\n"
  },
  {
    "path": "src/smolagents/prompts/code_agent.yaml",
    "content": "system_prompt: |-\n  You are an expert assistant who can solve any task using code blobs. You will be given a task to solve as best you can.\n  To do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\n  To solve the task, you must plan forward to proceed in a series of steps, in a cycle of Thought, Code, and Observation sequences.\n\n  At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.\n  Then in the Code sequence you should write the code in simple Python. The code sequence must be opened with '{{code_block_opening_tag}}', and closed with '{{code_block_closing_tag}}'.\n  During each intermediate step, you can use 'print()' to save whatever important information you will then need.\n  These print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\n  In the end you have to return a final answer using the `final_answer` tool.\n\n  Here are a few examples using notional tools:\n  ---\n  Task: \"Generate an image of the oldest person in this document.\"\n\n  Thought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\n  {{code_block_opening_tag}}\n  answer = document_qa(document=document, question=\"Who is the oldest person mentioned?\")\n  print(answer)\n  {{code_block_closing_tag}}\n  Observation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\n  Thought: I will now generate an image showcasing the oldest person.\n  {{code_block_opening_tag}}\n  image = image_generator(\"A portrait of John Doe, a 55-year-old man living in Canada.\")\n  final_answer(image)\n  {{code_block_closing_tag}}\n\n  ---\n  Task: \"What is the result of the following operation: 
5 + 3 + 1294.678?\"\n\n  Thought: I will use Python code to compute the result of the operation and then return the final answer using the `final_answer` tool.\n  {{code_block_opening_tag}}\n  result = 5 + 3 + 1294.678\n  final_answer(result)\n  {{code_block_closing_tag}}\n\n  ---\n  Task:\n  \"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.\n  You have been provided with these additional arguments, that you can access using the keys as variables in your Python code:\n  {'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}\"\n\n  Thought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.\n  {{code_block_opening_tag}}\n  translated_question = translator(question=question, src_lang=\"French\", tgt_lang=\"English\")\n  print(f\"The translated question is {translated_question}.\")\n  answer = image_qa(image=image, question=translated_question)\n  final_answer(f\"The answer is {answer}\")\n  {{code_block_closing_tag}}\n\n  ---\n  Task:\n  In a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\n  What does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\n  Thought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\n  {{code_block_opening_tag}}\n  pages = web_search(query=\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\")\n  print(pages)\n  {{code_block_closing_tag}}\n  Observation:\n  No result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\n  Thought: The query was maybe too restrictive and did not find any results. 
Let's try again with a broader query.\n  {{code_block_opening_tag}}\n  pages = web_search(query=\"1979 interview Stanislaus Ulam\")\n  print(pages)\n  {{code_block_closing_tag}}\n  Observation:\n  Found 6 pages:\n  [Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n  [Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n  (truncated)\n\n  Thought: I will read the first 2 pages to know more.\n  {{code_block_opening_tag}}\n  for url in [\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\", \"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\"]:\n      whole_page = visit_webpage(url)\n      print(whole_page)\n      print(\"\\n\" + \"=\"*80 + \"\\n\")  # Print separator between pages\n  {{code_block_closing_tag}}\n  Observation:\n  Manhattan Project Locations:\n  Los Alamos, NM\n  Stanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. 
In this interview, he discusses his work at\n  (truncated)\n\n  Thought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\" Let's answer in one word.\n  {{code_block_opening_tag}}\n  final_answer(\"diminished\")\n  {{code_block_closing_tag}}\n\n  ---\n  Task: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\n  Thought: I need to get the populations for both cities and compare them: I will use the tool `web_search` to get the population of both cities.\n  {{code_block_opening_tag}}\n  for city in [\"Guangzhou\", \"Shanghai\"]:\n      print(f\"Population {city}:\", web_search(f\"{city} population\"))\n  {{code_block_closing_tag}}\n  Observation:\n  Population Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\n  Population Shanghai: '26 million (2019)'\n\n  Thought: Now I know that Shanghai has the highest population.\n  {{code_block_opening_tag}}\n  final_answer(\"Shanghai\")\n  {{code_block_closing_tag}}\n\n  ---\n  Task: \"What is the current age of the pope, raised to the power 0.36?\"\n\n  Thought: I will use the tool `wikipedia_search` to get the age of the pope, and confirm that with a web search.\n  {{code_block_opening_tag}}\n  pope_age_wiki = wikipedia_search(query=\"current pope age\")\n  print(\"Pope age as per wikipedia:\", pope_age_wiki)\n  pope_age_search = web_search(query=\"current pope age\")\n  print(\"Pope age as per google search:\", pope_age_search)\n  {{code_block_closing_tag}}\n  Observation:\n  Pope age: \"The pope Francis is currently 88 years old.\"\n\n  Thought: I know that the pope is 88 years old. 
Let's compute the result using Python code.\n  {{code_block_opening_tag}}\n  pope_current_age = 88 ** 0.36\n  final_answer(pope_current_age)\n  {{code_block_closing_tag}}\n\n  Above examples were using notional tools that might not exist for you. On top of performing computations in the Python code snippets that you create, you only have access to these tools, behaving like regular python functions:\n  {{code_block_opening_tag}}\n  {%- for tool in tools.values() %}\n  {{ tool.to_code_prompt() }}\n  {% endfor %}\n  {{code_block_closing_tag}}\n\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  {{code_block_opening_tag}}\n  {%- for agent in managed_agents.values() %}\n  def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n      \"\"\"{{ agent.description }}\n\n      Args:\n          task: Long detailed description of the task.\n          additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n      \"\"\"\n  {% endfor %}\n  {{code_block_closing_tag}}\n  {%- endif %}\n\n  Here are the rules you should always follow to solve your task:\n  1. Always provide a 'Thought:' sequence, and a '{{code_block_opening_tag}}' sequence ending with '{{code_block_closing_tag}}', else you will fail.\n  2. Use only variables that you have defined!\n  3. Always use the right arguments for the tools. 
DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wikipedia_search(query=\"What is the place where James Bond lives?\")'.\n  4. For tools WITHOUT JSON output schema: Take care to not chain too many sequential tool calls in the same code block, as their output format is unpredictable. For instance, a call to wikipedia_search without a JSON output schema has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n  5. For tools WITH JSON output schema: You can confidently chain multiple tool calls and directly access structured output fields in the same code block! When a tool has a JSON output schema, you know exactly what fields and data types to expect, allowing you to write robust code that directly accesses the structured response (e.g., result['field_name']) without needing intermediate print() statements.\n  6. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n  7. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n  8. Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n  9. You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n  10. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n  11. Don't give up! 
You're in charge of solving the task, not providing directions to solve it.\n\n  {%- if custom_instructions %}\n  {{custom_instructions}}\n  {%- endif %}\n\n  Now Begin!\nplanning:\n  initial_plan : |-\n    You are a world expert at analyzing a situation to derive facts, and plan accordingly towards solving a task.\n    Below I will present you a task. You will need to 1. build a survey of facts known or needed to solve the task, then 2. make a plan of action to solve the task.\n\n    ## 1. Facts survey\n    You will build a comprehensive preparatory survey of which facts we have at our disposal and which ones we still need.\n    These \"facts\" will typically be specific names, dates, values, etc. Your answer should use the below headings:\n    ### 1.1. Facts given in the task\n    List here the specific facts given in the task that could help you (there might be nothing here).\n\n    ### 1.2. Facts to look up\n    List here any facts that we may need to look up.\n    Also list where to find each of these, for instance a website, a file... - maybe the task contains some sources that you should re-use here.\n\n    ### 1.3. Facts to derive\n    List here anything that we want to derive from the above by logical reasoning, for instance computation or simulation.\n\n    Don't make any assumptions. For each item, provide a thorough reasoning. Do not add anything else on top of three headings above.\n\n    ## 2. Plan\n    Then for the given task, develop a step-by-step high-level plan taking into account the above inputs and list of facts.\n    This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\n    Do not skip steps, do not add any superfluous steps. 
Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\n    After writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\n    You can leverage these tools, behaving like regular python functions:\n    ```python\n    {%- for tool in tools.values() %}\n    {{ tool.to_code_prompt() }}\n    {% endfor %}\n    ```\n\n    {%- if managed_agents and managed_agents.values() | list %}\n    You can also give tasks to team members.\n    Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n    You can also include any relevant variables or context using the 'additional_args' argument.\n    Here is a list of the team members that you can call:\n    ```python\n    {%- for agent in managed_agents.values() %}\n    def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n        \"\"\"{{ agent.description }}\n\n        Args:\n            task: Long detailed description of the task.\n            additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n        \"\"\"\n    {% endfor %}\n    ```\n    {%- endif %}\n\n    ---\n    Now begin! 
Here is your task:\n    ```\n    {{task}}\n    ```\n    First in part 1, write the facts survey, then in part 2, write your plan.\n  update_plan_pre_messages: |-\n    You are a world expert at analyzing a situation, and plan accordingly towards solving a task.\n    You have been given the following task:\n    ```\n    {{task}}\n    ```\n\n    Below you will find a history of attempts made to solve this task.\n    You will first have to produce a survey of known and unknown facts, then propose a step-by-step high-level plan to solve the task.\n    If the previous tries so far have met some success, your updated plan can build on these results.\n    If you are stalled, you can make a completely new plan starting from scratch.\n\n    Find the task and history below:\n  update_plan_post_messages: |-\n    Now write your updated facts below, taking into account the above history:\n    ## 1. Updated facts survey\n    ### 1.1. Facts given in the task\n    ### 1.2. Facts that we have learned\n    ### 1.3. Facts still to look up\n    ### 1.4. Facts still to derive\n\n    Then write a step-by-step high-level plan to solve the task above.\n    ## 2. Plan\n    ### 2.1. ...\n    Etc.\n    This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\n    Beware that you have {{remaining_steps}} steps remaining.\n    Do not skip steps, do not add any superfluous steps. 
Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\n    After writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\n    You can leverage these tools, behaving like regular python functions:\n    ```python\n    {%- for tool in tools.values() %}\n    {{ tool.to_code_prompt() }}\n    {% endfor %}\n    ```\n\n    {%- if managed_agents and managed_agents.values() | list %}\n    You can also give tasks to team members.\n    Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n    You can also include any relevant variables or context using the 'additional_args' argument.\n    Here is a list of the team members that you can call:\n    ```python\n    {%- for agent in managed_agents.values() %}\n    def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n        \"\"\"{{ agent.description }}\n\n        Args:\n            task: Long detailed description of the task.\n            additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n        \"\"\"\n    {% endfor %}\n    ```\n    {%- endif %}\n\n    Now write your updated facts survey below, then your new plan.\nmanaged_agent:\n  task: |-\n      You're a helpful agent named '{{name}}'.\n      You have been submitted this task by your manager.\n      ---\n      Task:\n      {{task}}\n      ---\n      You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible to give them a clear understanding of the answer.\n\n      Your final_answer WILL HAVE to contain these parts:\n      ### 1. Task outcome (short version):\n      ### 2. Task outcome (extremely detailed version):\n      ### 3. 
Additional context (if relevant):\n\n      Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost.\n      And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback.\n  report: |-\n      Here is the final answer from your managed agent '{{name}}':\n      {{final_answer}}\nfinal_answer:\n  pre_messages: |-\n    An agent tried to answer a user query but it got stuck and failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:\n  post_messages: |-\n    Based on the above, please provide an answer to the following user task:\n    {{task}}\n"
  },
  {
    "path": "src/smolagents/prompts/structured_code_agent.yaml",
    "content": "system_prompt: |-\n  You are an expert assistant who can solve any task using code blobs. You will be given a task to solve as best you can.\n  To do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\n  To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.\n\n  At each step, in the 'Thought:' attribute, you should first explain your reasoning towards solving the task and the tools that you want to use.\n  Then in the 'Code' attribute, you should write the code in simple Python.\n  During each intermediate step, you can use 'print()' to save whatever important information you will then need.\n  These print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\n  In the end you have to return a final answer using the `final_answer` tool. You will be generating a JSON object with the following structure:\n  ```json\n  {\n    \"thought\": \"...\",\n    \"code\": \"...\"\n  }\n  ```\n\n  Here are a few examples using notional tools:\n  ---\n  Task: \"Generate an image of the oldest person in this document.\"\n\n  {\"thought\": \"I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\", \"code\": \"answer = document_qa(document=document, question=\\\"Who is the oldest person mentioned?\\\")\\nprint(answer)\\n\"}\n  Observation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\n  {\"thought\": \"I will now generate an image showcasing the oldest person.\", \"code\": \"image = image_generator(\\\"A portrait of John Doe, a 55-year-old man living in Canada.\\\")\\nfinal_answer(image)\\n\"}\n  ---\n  Task: \"What is the result of the following operation: 5 + 3 + 
1294.678?\"\n\n  {\"thought\": \"I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool\", \"code\": \"result = 5 + 3 + 1294.678\\nfinal_answer(result)\\n\"}\n\n  ---\n  Task:\n  In a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\n  What does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\n  {\"thought\": \"I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\", \"code\": \"pages = web_search(query=\\\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\\\")\\nprint(pages)\\n\"}\n  Observation:\n  No result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\n  {\"thought\": \"The query was maybe too restrictive and did not find any results. Let's try again with a broader query.\", \"code\": \"pages = web_search(query=\\\"1979 interview Stanislaus Ulam\\\")\\nprint(pages)\\n\"}\n  Observation:\n  Found 6 pages:\n  [Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n  [Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n  (truncated)\n\n  {\"thought\": \"I will read the first 2 pages to know more.\", \"code\": \"for url in [\\\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\\\", \\\"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\\\"]:\\n      whole_page = visit_webpage(url)\\n      print(whole_page)\\n      print(\\\"\\n\\\" + \\\"=\\\"*80 + \\\"\\n\\\")  # Print separator between pages\"}\n\n  Observation:\n  Manhattan Project Locations:\n  Los Alamos, NM\n  Stanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. 
In this interview, he discusses his work at\n  (truncated)\n\n  {\"thought\": \"I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \\\"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\\\" Let's answer in one word.\", \"code\": \"final_answer(\\\"diminished\\\")\"}\n\n  ---\n  Task: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\n  {\"thought\": \"I need to get the populations for both cities and compare them: I will use the tool `web_search` to get the population of both cities.\", \"code\": \"for city in [\\\"Guangzhou\\\", \\\"Shanghai\\\"]:\\n      print(f\\\"Population {city}:\\\", web_search(f\\\"{city} population\\\"))\"}\n  Observation:\n  Population Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\n  Population Shanghai: '26 million (2019)'\n\n  {\"thought\": \"Now I know that Shanghai has the highest population.\", \"code\": \"final_answer(\\\"Shanghai\\\")\"}\n\n  ---\n  Task: \"What is the current age of the pope, raised to the power 0.36?\"\n\n  {\"thought\": \"I will use the tool `wikipedia_search` to get the age of the pope, and confirm that with a web search.\", \"code\": \"pope_age_wiki = wikipedia_search(query=\\\"current pope age\\\")\\nprint(\\\"Pope age as per wikipedia:\\\", pope_age_wiki)\\npope_age_search = web_search(query=\\\"current pope age\\\")\\nprint(\\\"Pope age as per google search:\\\", pope_age_search)\"}\n  Observation:\n  Pope age: \"The pope Francis is currently 88 years old.\"\n\n  {\"thought\": \"I know that the pope is 88 years old. Let's compute the result using python code.\", \"code\": \"pope_current_age = 88 ** 0.36\\nfinal_answer(pope_current_age)\"}\n\n  The above examples used notional tools that might not exist for you. 
On top of performing computations in the Python code snippets that you create, you only have access to these tools, behaving like regular python functions:\n  ```python\n  {%- for tool in tools.values() %}\n  {{ tool.to_code_prompt() }}\n  {% endfor %}\n  ```\n\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  ```python\n  {%- for agent in managed_agents.values() %}\n  def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n      \"\"\"{{ agent.description }}\n\n      Args:\n          task: Long detailed description of the task.\n          additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n      \"\"\"\n  {% endfor %}\n  ```\n  {%- endif %}\n\n  {%- if custom_instructions %}\n  {{custom_instructions}}\n  {%- endif %}\n\n  Here are the rules you should always follow to solve your task:\n  1. Use only variables that you have defined!\n  2. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wikipedia_search(query=\"What is the place where James Bond lives?\")'.\n  3. Take care to not chain too many sequential tool calls in the same code block, especially when the output format is unpredictable. 
For instance, a call to wikipedia_search has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n  4. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n  5. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n  6. Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n  7. You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n  8. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n  9. Don't give up! You're in charge of solving the task, not providing directions to solve it.\n\n  Now Begin!\nplanning:\n  initial_plan: |-\n    You are a world expert at analyzing a situation to derive facts, and plan accordingly towards solving a task.\n    Below I will present you a task. You will need to 1. build a survey of facts known or needed to solve the task, then 2. make a plan of action to solve the task.\n\n    ## 1. Facts survey\n    You will build a comprehensive preparatory survey of which facts we have at our disposal and which ones we still need.\n    These \"facts\" will typically be specific names, dates, values, etc. Your answer should use the below headings:\n    ### 1.1. Facts given in the task\n    List here the specific facts given in the task that could help you (there might be nothing here).\n\n    ### 1.2. Facts to look up\n    List here any facts that we may need to look up.\n    Also list where to find each of these, for instance a website, a file... - maybe the task contains some sources that you should re-use here.\n\n    ### 1.3. 
Facts to derive\n    List here anything that we want to derive from the above by logical reasoning, for instance computation or simulation.\n\n    Don't make any assumptions. For each item, provide a thorough reasoning. Do not add anything else on top of three headings above.\n\n    ## 2. Plan\n    Then for the given task, develop a step-by-step high-level plan taking into account the above inputs and list of facts.\n    This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\n    Do not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\n    After writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\n    You can leverage these tools, behaving like regular python functions:\n    ```python\n    {%- for tool in tools.values() %}\n    {{ tool.to_code_prompt() }}\n    {% endfor %}\n    ```\n\n    {%- if managed_agents and managed_agents.values() | list %}\n    You can also give tasks to team members.\n    Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n    You can also include any relevant variables or context using the 'additional_args' argument.\n    Here is a list of the team members that you can call:\n    ```python\n    {%- for agent in managed_agents.values() %}\n    def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n        \"\"\"{{ agent.description }}\n\n        Args:\n            task: Long detailed description of the task.\n            additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n        \"\"\"\n    {% endfor %}\n    ```\n    {%- endif %}\n\n    ---\n    Now begin! 
Here is your task:\n    ```\n    {{task}}\n    ```\n    First, in part 1, write the facts survey; then, in part 2, write your plan.\n  update_plan_pre_messages: |-\n    You are a world expert at analyzing a situation and planning accordingly towards solving a task.\n    You have been given the following task:\n    ```\n    {{task}}\n    ```\n\n    Below you will find a history of attempts made to solve this task.\n    You will first have to produce a survey of known and unknown facts, then propose a step-by-step high-level plan to solve the task.\n    If the previous tries so far have met some success, your updated plan can build on these results.\n    If you are stalled, you can make a completely new plan starting from scratch.\n\n    Find the task and history below:\n  update_plan_post_messages: |-\n    Now write your updated facts below, taking into account the above history:\n    ## 1. Updated facts survey\n    ### 1.1. Facts given in the task\n    ### 1.2. Facts that we have learned\n    ### 1.3. Facts still to look up\n    ### 1.4. Facts still to derive\n\n    Then write a step-by-step high-level plan to solve the task above.\n    ## 2. Plan\n    ### 2.1. ...\n    Etc.\n    This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\n    Beware that you have {{remaining_steps}} steps remaining.\n    Do not skip steps, do not add any superfluous steps. 
Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\n    After writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\n    You can leverage these tools, behaving like regular python functions:\n    ```python\n    {%- for tool in tools.values() %}\n    {{ tool.to_code_prompt() }}\n    {% endfor %}\n    ```\n\n    {%- if managed_agents and managed_agents.values() | list %}\n    You can also give tasks to team members.\n    Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n    You can also include any relevant variables or context using the 'additional_args' argument.\n    Here is a list of the team members that you can call:\n    ```python\n    {%- for agent in managed_agents.values() %}\n    def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n        \"\"\"{{ agent.description }}\n\n        Args:\n            task: Long detailed description of the task.\n            additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n        \"\"\"\n    {% endfor %}\n    ```\n    {%- endif %}\n\n    Now write your updated facts survey below, then your new plan.\nmanaged_agent:\n  task: |-\n    You're a helpful agent named '{{name}}'.\n    You have been submitted this task by your manager.\n    ---\n    Task:\n    {{task}}\n    ---\n    You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible to give them a clear understanding of the answer.\n\n    Your final_answer WILL HAVE to contain these parts:\n    ### 1. Task outcome (short version):\n    ### 2. Task outcome (extremely detailed version):\n    ### 3. 
Additional context (if relevant):\n\n    Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost.\n    And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback.\n  report: |-\n    Here is the final answer from your managed agent '{{name}}':\n    {{final_answer}}\nfinal_answer:\n  pre_messages: |-\n    An agent tried to answer a user query but it got stuck and failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:\n  post_messages: |-\n    Based on the above, please provide an answer to the following user task:\n    {{task}}\n"
  },
  {
    "path": "src/smolagents/prompts/toolcalling_agent.yaml",
    "content": "system_prompt: |-\n  You are an expert assistant who can solve any task using tool calls. You will be given a task to solve as best you can.\n  To do so, you have been given access to some tools.\n\n  The tool call you write is an action: after the tool is executed, you will get the result of the tool call as an \"observation\".\n  This Action/Observation can repeat N times, you should take several steps when needed.\n\n  You can use the result of the previous action as input for the next action.\n  The observation will always be a string: it can represent a file, like \"image_1.jpg\".\n  Then you can use it as input for the next action. You can do it for instance as follows:\n\n  Observation: \"image_1.jpg\"\n\n  Action:\n  {\n    \"name\": \"image_transformer\",\n    \"arguments\": {\"image\": \"image_1.jpg\"}\n  }\n\n  To provide the final answer to the task, use an action blob with \"name\": \"final_answer\" tool. It is the only way to complete the task, else you will be stuck on a loop. 
So your final output should look like this:\n  Action:\n  {\n    \"name\": \"final_answer\",\n    \"arguments\": {\"answer\": \"insert your final answer here\"}\n  }\n\n\n  Here are a few examples using notional tools:\n  ---\n  Task: \"Generate an image of the oldest person in this document.\"\n\n  Action:\n  {\n    \"name\": \"document_qa\",\n    \"arguments\": {\"document\": \"document.pdf\", \"question\": \"Who is the oldest person mentioned?\"}\n  }\n  Observation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\n  Action:\n  {\n    \"name\": \"image_generator\",\n    \"arguments\": {\"prompt\": \"A portrait of John Doe, a 55-year-old man living in Canada.\"}\n  }\n  Observation: \"image.png\"\n\n  Action:\n  {\n    \"name\": \"final_answer\",\n    \"arguments\": {\"answer\": \"image.png\"}\n  }\n\n  ---\n  Task: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\n  Action:\n  {\n      \"name\": \"python_interpreter\",\n      \"arguments\": {\"code\": \"5 + 3 + 1294.678\"}\n  }\n  Observation: 1302.678\n\n  Action:\n  {\n    \"name\": \"final_answer\",\n    \"arguments\": {\"answer\": \"1302.678\"}\n  }\n\n  ---\n  Task: \"Which city has the highest population, Guangzhou or Shanghai?\"\n\n  Action:\n  {\n      \"name\": \"web_search\",\n      \"arguments\": \"Population Guangzhou\"\n  }\n  Observation: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\n\n\n  Action:\n  {\n      \"name\": \"web_search\",\n      \"arguments\": \"Population Shanghai\"\n  }\n  Observation: '26 million (2019)'\n\n  Action:\n  {\n    \"name\": \"final_answer\",\n    \"arguments\": {\"answer\": \"Shanghai\"}\n  }\n\n  The above examples used notional tools that might not exist for you. 
You only have access to these tools:\n  {%- for tool in tools.values() %}\n  - {{ tool.to_tool_calling_prompt() }}\n  {%- endfor %}\n\n  {%- if managed_agents and managed_agents.values() | list %}\n  You can also give tasks to team members.\n  Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n  You can also include any relevant variables or context using the 'additional_args' argument.\n  Here is a list of the team members that you can call:\n  {%- for agent in managed_agents.values() %}\n  - {{ agent.name }}: {{ agent.description }}\n    - Takes inputs: {{agent.inputs}}\n    - Returns an output of type: {{agent.output_type}}\n  {%- endfor %}\n  {%- endif %}\n\n  {%- if custom_instructions %}\n  {{custom_instructions}}\n  {%- endif %}\n\n  Here are the rules you should always follow to solve your task:\n  1. ALWAYS provide a tool call, else you will fail.\n  2. Always use the right arguments for the tools. Never use variable names as the action arguments; use the values instead.\n  3. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself. If no tool call is needed, use the final_answer tool to return your answer.\n  4. Never re-do a tool call that you previously did with the exact same parameters.\n\n  Now Begin!\nplanning:\n  initial_plan: |-\n    You are a world expert at analyzing a situation to derive facts, and plan accordingly towards solving a task.\n    Below I will present you a task. You will need to 1. build a survey of facts known or needed to solve the task, then 2. make a plan of action to solve the task.\n\n    ## 1. 
Facts survey\n    You will build a comprehensive preparatory survey of which facts we have at our disposal and which ones we still need.\n    These \"facts\" will typically be specific names, dates, values, etc. Your answer should use the below headings:\n    ### 1.1. Facts given in the task\n    List here the specific facts given in the task that could help you (there might be nothing here).\n\n    ### 1.2. Facts to look up\n    List here any facts that we may need to look up.\n    Also list where to find each of these, for instance a website, a file... - maybe the task contains some sources that you should re-use here.\n\n    ### 1.3. Facts to derive\n    List here anything that we want to derive from the above by logical reasoning, for instance computation or simulation.\n\n    Don't make any assumptions. For each item, provide a thorough reasoning. Do not add anything else on top of three headings above.\n\n    ## 2. Plan\n    Then for the given task, develop a step-by-step high-level plan taking into account the above inputs and list of facts.\n    This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\n    Do not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\n    After writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\n    You can leverage these tools:\n    {%- for tool in tools.values() %}\n    - {{ tool.to_tool_calling_prompt() }}\n    {%- endfor %}\n\n    {%- if managed_agents and managed_agents.values() | list %}\n    You can also give tasks to team members.\n    Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. 
Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n    You can also include any relevant variables or context using the 'additional_args' argument.\n    Here is a list of the team members that you can call:\n    {%- for agent in managed_agents.values() %}\n    - {{ agent.name }}: {{ agent.description }}\n      - Takes inputs: {{agent.inputs}}\n      - Returns an output of type: {{agent.output_type}}\n    {%- endfor %}\n    {%- endif %}\n\n    ---\n    Now begin! Here is your task:\n    ```\n    {{task}}\n    ```\n    First, in part 1, write the facts survey; then, in part 2, write your plan.\n  update_plan_pre_messages: |-\n    You are a world expert at analyzing a situation and planning accordingly towards solving a task.\n    You have been given the following task:\n    ```\n    {{task}}\n    ```\n\n    Below you will find a history of attempts made to solve this task.\n    You will first have to produce a survey of known and unknown facts, then propose a step-by-step high-level plan to solve the task.\n    If the previous tries so far have met some success, your updated plan can build on these results.\n    If you are stalled, you can make a completely new plan starting from scratch.\n\n    Find the task and history below:\n  update_plan_post_messages: |-\n    Now write your updated facts below, taking into account the above history:\n    ## 1. Updated facts survey\n    ### 1.1. Facts given in the task\n    ### 1.2. Facts that we have learned\n    ### 1.3. Facts still to look up\n    ### 1.4. Facts still to derive\n\n    Then write a step-by-step high-level plan to solve the task above.\n    ## 2. Plan\n    ### 2.1. ...\n    Etc.\n    This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\n    Beware that you have {{remaining_steps}} steps remaining.\n    Do not skip steps, do not add any superfluous steps. 
Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\n    After writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\n    You can leverage these tools:\n    {%- for tool in tools.values() %}\n    - {{ tool.to_tool_calling_prompt() }}\n    {%- endfor %}\n\n    {%- if managed_agents and managed_agents.values() | list %}\n    You can also give tasks to team members.\n    Calling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\n    You can also include any relevant variables or context using the 'additional_args' argument.\n    Here is a list of the team members that you can call:\n    {%- for agent in managed_agents.values() %}\n    - {{ agent.name }}: {{ agent.description }}\n      - Takes inputs: {{agent.inputs}}\n      - Returns an output of type: {{agent.output_type}}\n    {%- endfor %}\n    {%- endif %}\n\n    Now write your new plan below.\nmanaged_agent:\n  task: |-\n      You're a helpful agent named '{{name}}'.\n      You have been submitted this task by your manager.\n      ---\n      Task:\n      {{task}}\n      ---\n      You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible to give them a clear understanding of the answer.\n\n      Your final_answer WILL HAVE to contain these parts:\n      ### 1. Task outcome (short version):\n      ### 2. Task outcome (extremely detailed version):\n      ### 3. 
Additional context (if relevant):\n\n      Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost.\n      And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback.\n  report: |-\n      Here is the final answer from your managed agent '{{name}}':\n      {{final_answer}}\nfinal_answer:\n  pre_messages: |-\n    An agent tried to answer a user query but it got stuck and failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:\n  post_messages: |-\n    Based on the above, please provide an answer to the following user task:\n    {{task}}\n"
  },
  {
    "path": "src/smolagents/remote_executors.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport base64\nimport inspect\nimport json\nimport os\nimport pickle\nimport re\nimport secrets\nimport subprocess\nimport tempfile\nimport time\nimport uuid\nfrom contextlib import closing\nfrom io import BytesIO\nfrom textwrap import dedent\nfrom typing import Any, Optional\n\nimport PIL.Image\nimport requests\nfrom requests.exceptions import RequestException\n\nfrom .default_tools import FinalAnswerTool\nfrom .local_python_executor import CodeOutput, PythonExecutor\nfrom .monitoring import LogLevel\nfrom .serialization import SafeSerializer, SerializationError\nfrom .tools import Tool, get_tools_definition_code\nfrom .utils import AgentError\n\n\n__all__ = [\"BlaxelExecutor\", \"E2BExecutor\", \"ModalExecutor\", \"DockerExecutor\", \"WasmExecutor\"]\n\n\ntry:\n    from dotenv import load_dotenv\n\n    load_dotenv()\nexcept ModuleNotFoundError:\n    pass\n\n\nclass RemotePythonExecutor(PythonExecutor):\n    \"\"\"\n    Executor of Python code in a remote environment.\n\n    Args:\n        additional_imports (`list[str]`): Additional Python packages to install.\n        logger (`Logger`): Logger to use for output and errors.\n        allow_pickle (`bool`, default `False`): Whether to allow pickle serialization for objects that cannot be safely serialized to JSON.\n            - `False` 
(default, recommended): Only safe JSON serialization is used. Raises error if object cannot be safely serialized.\n            - `True` (legacy mode): Tries safe JSON serialization first, falls back to pickle with warning if needed.\n\n            **Security Warning:** Pickle deserialization can execute arbitrary code. Only set `allow_pickle=True`\n            if you fully trust the execution environment and need backward compatibility with custom types.\n    \"\"\"\n\n    FINAL_ANSWER_EXCEPTION = \"FinalAnswerException\"\n\n    def __init__(\n        self,\n        additional_imports: list[str],\n        logger,\n        allow_pickle: bool = False,\n    ):\n        self.additional_imports = additional_imports\n        self.logger = logger\n        self.allow_pickle = allow_pickle\n        self.logger.log(\"Initializing executor, hold on...\")\n        self.installed_packages = []\n\n    def run_code_raise_errors(self, code: str) -> CodeOutput:\n        \"\"\"\n        Execute Python code in the remote environment and return the result.\n\n        Args:\n            code (`str`): Python code to execute.\n\n        Returns:\n            `CodeOutput`: Code output containing the result, logs, and whether it is the final answer.\n        \"\"\"\n        raise NotImplementedError\n\n    def send_tools(self, tools: dict[str, Tool]):\n        if \"final_answer\" in tools:\n            self._patch_final_answer_with_exception(tools[\"final_answer\"])\n        # Install tool packages\n        packages_to_install = {\n            pkg\n            for tool in tools.values()\n            for pkg in tool.to_dict()[\"requirements\"]\n            if pkg not in self.installed_packages + [\"smolagents\"]\n        }\n        if \"PIL\" in packages_to_install:\n            packages_to_install.discard(\"PIL\")\n            packages_to_install.add(\"pillow\")\n        if packages_to_install:\n            self.installed_packages += self.install_packages(list(packages_to_install))\n       
 # Get tool definitions\n        code = get_tools_definition_code(tools)\n        if code:\n            code_output = self.run_code_raise_errors(code)\n            self.logger.log(code_output.logs)\n\n    def send_variables(self, variables: dict[str, Any]):\n        \"\"\"Send variables to the kernel namespace using SafeSerializer.\n\n        Uses prefix-based format (\"safe:...\" or \"pickle:...\").\n        When allow_pickle=False, only safe JSON serialization is allowed.\n        When allow_pickle=True, pickle fallback is enabled for complex types.\n        \"\"\"\n        if not variables:\n            return\n\n        serialized = SafeSerializer.dumps(variables, allow_pickle=self.allow_pickle)\n        code = f\"\"\"\n{SafeSerializer.get_deserializer_code(self.allow_pickle)}\nvars_dict = _deserialize({repr(serialized)})\nlocals().update(vars_dict)\n\"\"\"\n        self.run_code_raise_errors(code)\n\n    def __call__(self, code_action: str) -> CodeOutput:\n        \"\"\"Run the code and determine if it is the final answer.\"\"\"\n        return self.run_code_raise_errors(code_action)\n\n    def install_packages(self, additional_imports: list[str]):\n        if additional_imports:\n            code_output = self.run_code_raise_errors(f\"!pip install {' '.join(additional_imports)}\")\n            self.logger.log(code_output.logs)\n        return additional_imports\n\n    def _patch_final_answer_with_exception(self, final_answer_tool: FinalAnswerTool):\n        \"\"\"Patch the FinalAnswerTool to raise an exception.\n\n        This is necessary because the remote executors\n        rely on the FinalAnswerTool to detect the final answer.\n        It modifies the `forward` method of the FinalAnswerTool to raise\n        a `FinalAnswerException` with the final answer as a serialized value.\n        This allows the executor to catch this exception and return the final answer.\n\n        Uses prefix-based format (\"safe:\" or \"pickle:\") for serialization.\n\n        
Args:\n            final_answer_tool (`FinalAnswerTool`): FinalAnswerTool instance to patch.\n        \"\"\"\n\n        # Create a new class that inherits from the original FinalAnswerTool\n        class _FinalAnswerTool(final_answer_tool.__class__):\n            pass\n\n        # Add a new forward method that raises the FinalAnswerException\n        # NOTE: Serialization logic is inlined here because this method's source code\n        # is extracted and sent to remote environments where external references don't exist\n        # Capture settings via closure\n        allow_pickle_setting = self.allow_pickle\n\n        def forward(self, *args, **kwargs) -> Any:\n            import base64\n            import json\n            from io import BytesIO\n\n            # Baked in from closure at patch time\n            ALLOW_PICKLE = allow_pickle_setting\n\n            class SerializationError(Exception):\n                pass\n\n            def _to_json_safe(obj):\n                if isinstance(obj, (str, int, float, bool, type(None))):\n                    return obj\n                elif isinstance(obj, dict):\n                    # Check if all keys are strings (JSON-compatible)\n                    if all(isinstance(k, str) for k in obj.keys()):\n                        return {k: _to_json_safe(v) for k, v in obj.items()}\n                    else:\n                        return {\n                            \"__type__\": \"dict_with_complex_keys\",\n                            \"data\": [[_to_json_safe(k), _to_json_safe(v)] for k, v in obj.items()],\n                        }\n                elif isinstance(obj, list):\n                    return [_to_json_safe(item) for item in obj]\n                elif isinstance(obj, tuple):\n                    return {\"__type__\": \"tuple\", \"data\": [_to_json_safe(item) for item in obj]}\n                elif isinstance(obj, set):\n                    return {\"__type__\": \"set\", \"data\": [_to_json_safe(item) for item 
in obj]}\n                elif isinstance(obj, bytes):\n                    return {\"__type__\": \"bytes\", \"data\": base64.b64encode(obj).decode()}\n                elif isinstance(obj, complex):\n                    return {\"__type__\": \"complex\", \"real\": obj.real, \"imag\": obj.imag}\n                elif isinstance(obj, frozenset):\n                    return {\"__type__\": \"frozenset\", \"data\": [_to_json_safe(item) for item in obj]}\n\n                # Try PIL Image\n                try:\n                    import PIL.Image\n\n                    if isinstance(obj, PIL.Image.Image):\n                        buffer = BytesIO()\n                        obj.save(buffer, format=\"PNG\")\n                        return {\"__type__\": \"PIL.Image\", \"data\": base64.b64encode(buffer.getvalue()).decode()}\n                except ImportError:\n                    pass\n\n                # Lazy imports for less common types\n                from datetime import date, datetime, time, timedelta\n                from decimal import Decimal\n                from pathlib import Path\n\n                if isinstance(obj, datetime):\n                    return {\"__type__\": \"datetime\", \"data\": obj.isoformat()}\n                elif isinstance(obj, date):\n                    return {\"__type__\": \"date\", \"data\": obj.isoformat()}\n                elif isinstance(obj, time):\n                    return {\"__type__\": \"time\", \"data\": obj.isoformat()}\n                elif isinstance(obj, timedelta):\n                    return {\"__type__\": \"timedelta\", \"total_seconds\": obj.total_seconds()}\n                elif isinstance(obj, Decimal):\n                    return {\"__type__\": \"Decimal\", \"data\": str(obj)}\n                elif isinstance(obj, Path):\n                    return {\"__type__\": \"Path\", \"data\": str(obj)}\n\n                # Try numpy if available\n                try:\n                    import numpy as np\n\n               
     if isinstance(obj, np.ndarray):\n                        return {\"__type__\": \"ndarray\", \"data\": obj.tolist(), \"dtype\": str(obj.dtype)}\n                    elif isinstance(obj, (np.integer, np.floating)):\n                        return obj.item()\n                except ImportError:\n                    pass\n\n                # Try dataclass\n                import dataclasses\n\n                if dataclasses.is_dataclass(obj) and not isinstance(obj, type):\n                    return {\n                        \"__type__\": \"dataclass\",\n                        \"class_name\": type(obj).__name__,\n                        \"module\": type(obj).__module__,\n                        \"data\": {f.name: _to_json_safe(getattr(obj, f.name)) for f in dataclasses.fields(obj)},\n                    }\n\n                # Cannot safely serialize - raise error for safe mode\n                raise SerializationError(f\"Cannot safely serialize object of type {type(obj).__name__}\")\n\n            def _serialize_with_fallback(obj):\n                \"\"\"Serialize with safe method, fallback to pickle if allowed.\"\"\"\n                import pickle\n\n                if not ALLOW_PICKLE:\n                    # Safe ONLY mode - NO pickle fallback, raise error if can't serialize\n                    json_safe = _to_json_safe(obj)  # Will raise SerializationError if fails\n                    return \"safe:\" + json.dumps(json_safe)\n                else:\n                    # Try safe first, fallback to pickle if allowed\n                    try:\n                        json_safe = _to_json_safe(obj)\n                        return \"safe:\" + json.dumps(json_safe)\n                    except SerializationError:\n                        # Fallback to pickle\n                        try:\n                            return \"pickle:\" + base64.b64encode(pickle.dumps(obj)).decode()\n                        except (pickle.PicklingError, TypeError, 
AttributeError):\n                            # Last resort: string representation\n                            return \"safe:\" + json.dumps(str(obj))\n\n            class FinalAnswerException(BaseException):\n                def __init__(self, value):\n                    self.value = value\n\n            raise FinalAnswerException(_serialize_with_fallback(self._forward(*args, **kwargs)))\n\n        # - Set the new forward method function to the _FinalAnswerTool class\n        _FinalAnswerTool.forward = forward\n\n        # Set __source__ with the actual values baked in (closures don't survive source extraction)\n        source = inspect.getsource(forward)\n        source = source.replace(\"ALLOW_PICKLE = allow_pickle_setting\", f\"ALLOW_PICKLE = {allow_pickle_setting}\")\n        forward.__source__ = source\n\n        # Rename the original forward method to _forward\n        # - Get the original forward method function from the final_answer_tool instance\n        original_forward_function = final_answer_tool.forward.__func__\n        # - Set the new _forward method function to the _FinalAnswerTool class\n        _FinalAnswerTool._forward = original_forward_function\n        # - Update the source code of the new forward method to match the original but with the new name\n        _FinalAnswerTool._forward.__source__ = inspect.getsource(original_forward_function).replace(\n            \"def forward(\", \"def _forward(\"\n        )\n\n        # Set the new class as the class of the final_answer_tool instance\n        final_answer_tool.__class__ = _FinalAnswerTool\n\n    @staticmethod\n    def _deserialize_final_answer(encoded_value: str, allow_pickle: bool = True) -> Any:\n        \"\"\"Deserialize final answer with format detection.\n\n        Accepts explicit prefix-based formats only:\n        - \"safe:\" for JSON-safe payloads\n        - \"pickle:\" for pickle payloads (only when allow_pickle=True)\n\n        Args:\n            encoded_value: Serialized string 
from FinalAnswerException.\n            allow_pickle: Whether to allow pickle deserialization.\n\n        Returns:\n            Deserialized Python object.\n\n        Raises:\n            SerializationError: If pickle data is rejected (`allow_pickle=False`) or the prefix is unrecognized.\n        \"\"\"\n        if encoded_value.startswith(\"safe:\"):\n            json_data = json.loads(encoded_value[5:])\n            return SafeSerializer.from_json_safe(json_data)\n        elif encoded_value.startswith(\"pickle:\"):\n            if not allow_pickle:\n                raise SerializationError(\"Pickle data rejected: allow_pickle=False\")\n            return pickle.loads(base64.b64decode(encoded_value[7:]))\n        else:\n            raise SerializationError(\"Unknown final answer format: expected 'safe:' or 'pickle:' prefix\")\n\n\nclass E2BExecutor(RemotePythonExecutor):\n    \"\"\"\n    Remote Python code executor in an E2B sandbox.\n\n    Args:\n        additional_imports (`list[str]`): Additional Python packages to install.\n        logger (`Logger`): Logger to use for output and errors.\n        allow_pickle (`bool`, default `False`): Whether to allow pickle serialization for objects that cannot be safely serialized to JSON.\n            - `False` (default, recommended): Only safe JSON serialization is used. Raises an error if the object cannot be safely serialized.\n            - `True` (legacy mode): Tries safe JSON serialization first, falls back to pickle with warning if needed.\n\n            **Security Warning:** Pickle deserialization can execute arbitrary code. 
Only set `allow_pickle=True`\n            if you fully trust the execution environment and need backward compatibility with custom types.\n        **kwargs: Additional keyword arguments to pass to the E2B Sandbox instantiation.\n    \"\"\"\n\n    def __init__(\n        self,\n        additional_imports: list[str],\n        logger,\n        allow_pickle: bool = False,\n        **kwargs,\n    ):\n        super().__init__(additional_imports, logger, allow_pickle)\n        try:\n            from e2b_code_interpreter import Sandbox\n        except ModuleNotFoundError:\n            raise ModuleNotFoundError(\n                \"\"\"Please install 'e2b' extra to use E2BExecutor: `pip install 'smolagents[e2b]'`\"\"\"\n            )\n        # Support both e2b v1 and v2 constructors\n        # v2 exposes Sandbox.create(...), while v1 uses Sandbox(...)\n        if hasattr(Sandbox, \"create\"):\n            self.sandbox = Sandbox.create(**kwargs)\n        else:\n            self.sandbox = Sandbox(**kwargs)\n        self.installed_packages = self.install_packages(additional_imports)\n        self.logger.log(\"E2B is running\", level=LogLevel.INFO)\n\n    def run_code_raise_errors(self, code: str) -> CodeOutput:\n        \"\"\"\n        Execute Python code in the E2B sandbox and return the result.\n\n        Args:\n            code (`str`): Python code to execute.\n\n        Returns:\n            `CodeOutput`: Code output containing the result, logs, and whether it is the final answer.\n        \"\"\"\n        execution = self.sandbox.run_code(code)\n        execution_logs = \"\\n\".join([str(log) for log in execution.logs.stdout])\n\n        # Handle errors\n        if execution.error:\n            # Check if the error is a FinalAnswerException\n            if execution.error.name == RemotePythonExecutor.FINAL_ANSWER_EXCEPTION:\n                final_answer = self._deserialize_final_answer(execution.error.value, self.allow_pickle)\n                return 
CodeOutput(output=final_answer, logs=execution_logs, is_final_answer=True)\n\n            # Construct error message\n            error_message = (\n                f\"{execution_logs}\\n\"\n                f\"Executing code yielded an error:\\n\"\n                f\"{execution.error.name}\\n\"\n                f\"{execution.error.value}\\n\"\n                f\"{execution.error.traceback}\"\n            )\n            raise AgentError(error_message, self.logger)\n\n        # Handle results\n        if not execution.results:\n            return CodeOutput(output=None, logs=execution_logs, is_final_answer=False)\n\n        for result in execution.results:\n            if not result.is_main_result:\n                continue\n            # Handle image outputs\n            for attribute_name in [\"jpeg\", \"png\"]:\n                img_data = getattr(result, attribute_name, None)\n                if img_data is not None:\n                    decoded_bytes = base64.b64decode(img_data.encode(\"utf-8\"))\n                    return CodeOutput(\n                        output=PIL.Image.open(BytesIO(decoded_bytes)), logs=execution_logs, is_final_answer=False\n                    )\n            # Handle other data formats\n            for attribute_name in [\n                \"chart\",\n                \"data\",\n                \"html\",\n                \"javascript\",\n                \"json\",\n                \"latex\",\n                \"markdown\",\n                \"pdf\",\n                \"svg\",\n                \"text\",\n            ]:\n                data = getattr(result, attribute_name, None)\n                if data is not None:\n                    return CodeOutput(output=data, logs=execution_logs, is_final_answer=False)\n        # If no main result found, return None\n        return CodeOutput(output=None, logs=execution_logs, is_final_answer=False)\n\n    def cleanup(self):\n        \"\"\"Clean up the E2B sandbox and resources.\"\"\"\n        try:\n     
       if hasattr(self, \"sandbox\"):\n                self.logger.log(\"Shutting down sandbox...\", level=LogLevel.INFO)\n                self.sandbox.kill()\n                self.logger.log(\"Sandbox cleanup completed\", level=LogLevel.INFO)\n                del self.sandbox\n        except Exception as e:\n            self.logger.log_error(f\"Error during cleanup: {e}\")\n\n\ndef _websocket_send_execute_request(code: str, ws) -> str:\n    \"\"\"Send code execution request to kernel.\"\"\"\n    import uuid\n\n    # Generate a unique message ID\n    msg_id = str(uuid.uuid4())\n\n    # Create execute request\n    execute_request = {\n        \"header\": {\n            \"msg_id\": msg_id,\n            \"username\": \"anonymous\",\n            \"session\": str(uuid.uuid4()),\n            \"msg_type\": \"execute_request\",\n            \"version\": \"5.0\",\n        },\n        \"parent_header\": {},\n        \"metadata\": {},\n        \"content\": {\n            \"code\": code,\n            \"silent\": False,\n            \"store_history\": True,\n            \"user_expressions\": {},\n            \"allow_stdin\": False,\n        },\n    }\n\n    ws.send(json.dumps(execute_request))\n    return msg_id\n\n\ndef _websocket_run_code_raise_errors(\n    code: str, ws, logger, allow_pickle: bool = True, safe_serialization: bool = False\n) -> CodeOutput:\n    \"\"\"Run code over a websocket.\"\"\"\n    try:\n        # Send execute request\n        msg_id = _websocket_send_execute_request(code, ws)\n\n        # Collect output and results\n        outputs = []\n        result = None\n        is_final_answer = False\n\n        while True:\n            msg = json.loads(ws.recv())\n            parent_msg_id = msg.get(\"parent_header\", {}).get(\"msg_id\")\n            # Skip unrelated messages\n            if parent_msg_id != msg_id:\n                continue\n            msg_type = msg.get(\"msg_type\", \"\")\n            msg_content = msg.get(\"content\", {})\n            if 
msg_type == \"stream\":\n                outputs.append(msg_content[\"text\"])\n            elif msg_type == \"execute_result\":\n                result = msg_content[\"data\"].get(\"text/plain\", None)\n            elif msg_type == \"error\":\n                if msg_content.get(\"ename\", \"\") == RemotePythonExecutor.FINAL_ANSWER_EXCEPTION:\n                    result = RemotePythonExecutor._deserialize_final_answer(\n                        msg_content.get(\"evalue\", \"\"), allow_pickle\n                    )\n                    is_final_answer = True\n                else:\n                    raise AgentError(\"\\n\".join(msg_content.get(\"traceback\", [])), logger)\n            elif msg_type == \"status\" and msg_content[\"execution_state\"] == \"idle\":\n                break\n\n        return CodeOutput(output=result, logs=\"\".join(outputs), is_final_answer=is_final_answer)\n\n    except Exception as e:\n        logger.log_error(f\"Code execution failed: {e}\")\n        raise\n\n\ndef _create_kernel_http(create_kernel_endpoint: str, logger, headers: Optional[dict] = None) -> str:\n    \"\"\"Create a kernel via HTTP and return its ID.\"\"\"\n\n    r = requests.post(create_kernel_endpoint, headers=headers)\n    if r.status_code != 201:\n        error_details = {\n            \"status_code\": r.status_code,\n            \"headers\": dict(r.headers),\n            \"url\": r.url,\n            \"body\": r.text,\n            \"request_method\": r.request.method,\n            \"request_headers\": dict(r.request.headers),\n            \"request_body\": r.request.body,\n        }\n        logger.log_error(f\"Failed to create kernel. 
Details: {json.dumps(error_details, indent=2)}\")\n        raise RuntimeError(f\"Failed to create kernel: Status {r.status_code}\\nResponse: {r.text}\") from None\n    return r.json()[\"id\"]\n\n\nclass DockerExecutor(RemotePythonExecutor):\n    \"\"\"\n    Remote Python code executor using Jupyter Kernel Gateway in a Docker container.\n\n    Args:\n        additional_imports (`list[str]`): Additional Python packages to install.\n        logger (`Logger`): Logger to use for output and errors.\n        allow_pickle (`bool`, default `False`): Whether to allow pickle serialization for objects that cannot be safely serialized to JSON.\n            - `False` (default, recommended): Only safe JSON serialization is used. Raises error if object cannot be safely serialized.\n            - `True` (legacy mode): Tries safe JSON serialization first, falls back to pickle with warning if needed.\n\n            **Security Warning:** Pickle deserialization can execute arbitrary code. Only set `allow_pickle=True`\n            if you fully trust the execution environment and need backward compatibility with custom types.\n        host (`str`, default `\"127.0.0.1\"`): Host to bind to.\n        port (`int`, default `8888`): Port to bind to.\n        image_name (`str`, default `\"jupyter-kernel\"`): Name of the Docker image to use. If the image doesn't exist, it will be built.\n        build_new_image (`bool`, default `True`): Whether to rebuild a new image even if it already exists.\n        container_run_kwargs (`dict`, *optional*): Additional keyword arguments to pass to the Docker container run command.\n        dockerfile_content (`str`, *optional*): Custom Dockerfile content. 
If `None`, uses default.\n    \"\"\"\n\n    def __init__(\n        self,\n        additional_imports: list[str],\n        logger,\n        allow_pickle: bool = False,\n        host: str = \"127.0.0.1\",\n        port: int = 8888,\n        image_name: str = \"jupyter-kernel\",\n        build_new_image: bool = True,\n        container_run_kwargs: dict[str, Any] | None = None,\n        dockerfile_content: str | None = None,\n    ):\n        super().__init__(additional_imports, logger, allow_pickle)\n        try:\n            import docker\n        except ModuleNotFoundError:\n            raise ModuleNotFoundError(\n                \"Please install 'docker' extra to use DockerExecutor: `pip install 'smolagents[docker]'`\"\n            )\n        self.host = host\n        self.port = port\n        self.image_name = image_name\n\n        self.dockerfile_content = dockerfile_content or dedent(\n            \"\"\"\\\n            FROM python:3.12-bullseye\n\n            RUN pip install jupyter_kernel_gateway jupyter_client ipykernel\n\n            EXPOSE 8888\n            CMD [\"jupyter\", \"kernelgateway\", \"--KernelGatewayApp.ip=0.0.0.0\", \"--KernelGatewayApp.port=8888\"]\n            \"\"\"\n        )\n\n        # Initialize Docker\n        try:\n            self.client = docker.from_env()\n        except docker.errors.DockerException as e:\n            raise RuntimeError(\"Could not connect to Docker daemon: make sure Docker is running.\") from e\n\n        # Build and start container\n        try:\n            # Check if image exists, unless forced to rebuild\n            if not build_new_image:\n                try:\n                    self.client.images.get(self.image_name)\n                    self.logger.log(f\"Using existing Docker image: {self.image_name}\", level=LogLevel.INFO)\n                except docker.errors.ImageNotFound:\n                    self.logger.log(f\"Image {self.image_name} not found, building...\", level=LogLevel.INFO)\n                    
build_new_image = True\n\n            if build_new_image:\n                self.logger.log(f\"Building Docker image {self.image_name}...\", level=LogLevel.INFO)\n                dockerfile_obj = BytesIO(self.dockerfile_content.encode(\"utf-8\"))\n                _, build_logs = self.client.images.build(fileobj=dockerfile_obj, tag=self.image_name)\n                for log_chunk in build_logs:\n                    # Only log non-empty messages\n                    if log_message := log_chunk.get(\"stream\", \"\").rstrip():\n                        self.logger.log(log_message, level=LogLevel.DEBUG)\n\n            self.logger.log(f\"Starting container on {host}:{port}...\", level=LogLevel.INFO)\n            # Create base container parameters\n            container_kwargs = {}\n            if container_run_kwargs:\n                container_kwargs.update(container_run_kwargs)\n\n            # Ensure required port mapping and background running\n            if not isinstance(container_kwargs.get(\"ports\"), dict):\n                container_kwargs[\"ports\"] = {}\n            container_kwargs[\"ports\"][\"8888/tcp\"] = (host, port)\n            container_kwargs[\"detach\"] = True\n\n            # Generate auth token and pass it to the kernel gateway via the standard KG_AUTH_TOKEN env var\n            token = secrets.token_urlsafe(16)\n            env = container_kwargs.get(\"environment\") or {}\n            if isinstance(env, list):\n                env = dict(kv.split(\"=\", 1) for kv in env if \"=\" in kv)\n            env[\"KG_AUTH_TOKEN\"] = token\n            container_kwargs[\"environment\"] = env\n\n            self.container = self.client.containers.run(self.image_name, **container_kwargs)\n\n            retries = 0\n            while self.container.status != \"running\" and retries < 5:\n                self.logger.log(f\"Container status: {self.container.status}, waiting...\", level=LogLevel.INFO)\n                time.sleep(1)\n                
self.container.reload()\n                retries += 1\n\n            self.base_url = f\"http://{host}:{port}\"\n\n            # Wait for Jupyter to start\n            self._wait_for_server(token)\n\n            # Create new kernel via HTTP\n            self.kernel_id = _create_kernel_http(f\"{self.base_url}/api/kernels?token={token}\", self.logger)\n            self.ws_url = f\"ws://{host}:{port}/api/kernels/{self.kernel_id}/channels?token={token}\"\n\n            self.installed_packages = self.install_packages(additional_imports)\n            self.logger.log(\n                f\"Container {self.container.short_id} is running with kernel {self.kernel_id}\", level=LogLevel.INFO\n            )\n\n        except Exception as e:\n            self.cleanup()\n            raise RuntimeError(f\"Failed to initialize Jupyter kernel: {e}\") from e\n\n    def run_code_raise_errors(self, code: str) -> CodeOutput:\n        \"\"\"\n        Execute Python code in the Docker container and return the result.\n\n        Args:\n            code (`str`): Python code to execute.\n\n        Returns:\n            `CodeOutput`: Code output containing the result, logs, and whether it is the final answer.\n        \"\"\"\n        from websocket import create_connection\n\n        with closing(create_connection(self.ws_url)) as ws:\n            return _websocket_run_code_raise_errors(code, ws, self.logger, self.allow_pickle)\n\n    def cleanup(self):\n        \"\"\"Clean up the Docker container and resources.\"\"\"\n        try:\n            if hasattr(self, \"container\"):\n                self.logger.log(f\"Stopping and removing container {self.container.short_id}...\", level=LogLevel.INFO)\n                self.container.stop()\n                self.container.remove()\n                self.logger.log(\"Container cleanup completed\", level=LogLevel.INFO)\n                del self.container\n        except Exception as e:\n            self.logger.log_error(f\"Error during cleanup: {e}\")\n\n 
   def delete(self):\n        \"\"\"Ensure cleanup on deletion.\"\"\"\n        self.cleanup()\n\n    def _wait_for_server(self, token: str):\n        retries = 0\n        jupyter_ready = False\n        while not jupyter_ready and retries < 10:\n            try:\n                if requests.get(f\"{self.base_url}/api/kernelspecs?token={token}\", timeout=2).status_code == 200:\n                    jupyter_ready = True\n                else:\n                    self.logger.log(\"Jupyter not ready, waiting...\", level=LogLevel.INFO)\n            except requests.RequestException:\n                self.logger.log(\"Jupyter not ready, waiting...\", level=LogLevel.INFO)\n            if not jupyter_ready:\n                time.sleep(1)\n                retries += 1\n\n\nclass ModalExecutor(RemotePythonExecutor):\n    \"\"\"\n    Remote Python code executor in a Modal sandbox.\n\n    Args:\n        additional_imports (`list[str]`): Additional Python packages to install.\n        logger (`Logger`): Logger to use for output and errors.\n        allow_pickle (`bool`, default `False`): Whether to allow pickle serialization for objects that cannot be safely serialized to JSON.\n            - `False` (default, recommended): Only safe JSON serialization is used. Raises error if object cannot be safely serialized.\n            - `True` (legacy mode): Tries safe JSON serialization first, falls back to pickle with warning if needed.\n\n            **Security Warning:** Pickle deserialization can execute arbitrary code. Only set `allow_pickle=True`\n            if you fully trust the execution environment and need backward compatibility with custom types.\n        app_name (`str`, default `\"smolagent-executor\"`): App name.\n        port (`int`, default `8888`): Port for jupyter to bind to.\n        create_kwargs (`dict`, *optional*): Additional keyword arguments to pass to the Modal Sandbox create command. 
See\n            `modal.Sandbox.create` [docs](https://modal.com/docs/reference/modal.Sandbox#create) for all the\n            keyword arguments.\n    \"\"\"\n\n    _ANSI_ESCAPE = re.compile(r\"\\x1B(?:[@-Z\\\\-_]|\\[[0-?]*[ -/]*[@-~])\")\n\n    def __init__(\n        self,\n        additional_imports: list[str],\n        logger,\n        allow_pickle: bool = False,\n        app_name: str = \"smolagent-executor\",\n        port: int = 8888,\n        create_kwargs: Optional[dict] = None,\n    ):\n        super().__init__(additional_imports, logger, allow_pickle)\n        self.port = port\n        try:\n            import modal\n        except ModuleNotFoundError:\n            raise ModuleNotFoundError(\n                \"\"\"Please install 'modal' extra to use ModalExecutor: `pip install 'smolagents[modal]'`\"\"\"\n            )\n\n        if create_kwargs is None:\n            create_kwargs = {}\n\n        create_kwargs = {\n            \"image\": modal.Image.debian_slim().uv_pip_install(\"jupyter_kernel_gateway\", \"ipykernel\"),\n            \"timeout\": 60 * 5,\n            **create_kwargs,\n        }\n\n        if \"app\" not in create_kwargs:\n            create_kwargs[\"app\"] = modal.App.lookup(app_name, create_if_missing=True)\n\n        if \"encrypted_ports\" not in create_kwargs:\n            create_kwargs[\"encrypted_ports\"] = [port]\n        else:\n            create_kwargs[\"encrypted_ports\"] = create_kwargs[\"encrypted_ports\"] + [port]\n\n        token = secrets.token_urlsafe(16)\n        default_secrets = [modal.Secret.from_dict({\"KG_AUTH_TOKEN\": token})]\n\n        if \"secrets\" not in create_kwargs:\n            create_kwargs[\"secrets\"] = default_secrets\n        else:\n            create_kwargs[\"secrets\"] = create_kwargs[\"secrets\"] + default_secrets\n\n        entrypoint = [\n            \"jupyter\",\n            \"kernelgateway\",\n            \"--KernelGatewayApp.ip=0.0.0.0\",\n            f\"--KernelGatewayApp.port={port}\",\n       
 ]\n\n        self.logger.log(\"Starting Modal sandbox\", level=LogLevel.INFO)\n        self.sandbox = modal.Sandbox.create(\n            *entrypoint,\n            **create_kwargs,\n        )\n\n        tunnel = self.sandbox.tunnels()[port]\n        self.logger.log(f\"Waiting for Modal sandbox on {tunnel.host}:{port}\", level=LogLevel.INFO)\n        self._wait_for_server(tunnel.host, token)\n\n        self.logger.log(\"Starting Jupyter kernel\", level=LogLevel.INFO)\n        kernel_id = _create_kernel_http(f\"https://{tunnel.host}/api/kernels?token={token}\", logger)\n        self.ws_url = f\"wss://{tunnel.host}/api/kernels/{kernel_id}/channels?token={token}\"\n        self.installed_packages = self.install_packages(additional_imports)\n\n    def run_code_raise_errors(self, code: str) -> CodeOutput:\n        \"\"\"\n        Execute Python code in the Modal sandbox and return the result.\n\n        Args:\n            code (`str`): Python code to execute.\n\n        Returns:\n            `CodeOutput`: Code output containing the result, logs, and whether it is the final answer.\n        \"\"\"\n        from websocket import create_connection\n\n        with closing(create_connection(self.ws_url)) as ws:\n            return _websocket_run_code_raise_errors(code, ws, self.logger, self.allow_pickle)\n\n    def cleanup(self):\n        \"\"\"Clean up the Modal sandbox by terminating it.\"\"\"\n        if hasattr(self, \"sandbox\"):\n            self.sandbox.terminate()\n\n    def delete(self):\n        \"\"\"Ensure cleanup on deletion.\"\"\"\n        self.cleanup()\n\n    def _wait_for_server(self, host: str, token: str):\n        \"\"\"Wait for server to start up.\"\"\"\n        n_retries = 0\n        while True:\n            try:\n                resp = requests.get(f\"https://{host}/api/kernelspecs?token={token}\")\n                if resp.status_code == 200:\n                    break\n            except RequestException:\n                n_retries += 1\n               
 if n_retries % 10 == 0:\n                    self.logger.log(\"Waiting for server to startup, retrying...\", level=LogLevel.INFO)\n                if n_retries > 60:\n                    raise RuntimeError(\"Unable to connect to sandbox\")\n                time.sleep(1.0)\n\n    @classmethod\n    def _strip_ansi_colors(cls, text: str) -> str:\n        \"\"\"Remove ansi colors from text.\"\"\"\n        return cls._ANSI_ESCAPE.sub(\"\", text)\n\n\nclass BlaxelExecutor(RemotePythonExecutor):\n    \"\"\"\n    Remote Python code executor in a Blaxel sandbox.\n\n    Blaxel provides fast-launching virtual machines that start from hibernation in under 25ms\n    and scale back to zero after inactivity while maintaining memory state.\n\n    Args:\n        additional_imports (`list[str]`): Additional Python packages to install.\n        logger (`Logger`): Logger to use for output and errors.\n        allow_pickle (`bool`, default `False`): Whether to allow pickle serialization for objects that cannot be safely serialized to JSON.\n            - `False` (default, recommended): Only safe JSON serialization is used. Raises error if object cannot be safely serialized.\n            - `True` (legacy mode): Tries safe JSON serialization first, falls back to pickle with warning if needed.\n\n            **Security Warning:** Pickle deserialization can execute arbitrary code. Only set `allow_pickle=True`\n            if you fully trust the execution environment and need backward compatibility with custom types.\n        sandbox_name (`str`, *optional*): Name for the sandbox. Defaults to \"smolagent-executor\".\n        image (`str`, default `\"blaxel/jupyter-notebook\"`): Docker image to use.\n        memory (`int`, default `4096`): Memory allocation in MB.\n        ttl (`str`, *optional*): Time to live in seconds.\n        region (`str`, *optional*): Deployment region. 
If not specified, Blaxel chooses default.\n    \"\"\"\n\n    def __init__(\n        self,\n        additional_imports: list[str],\n        logger,\n        allow_pickle: bool = False,\n        sandbox_name: str | None = None,\n        image: str = \"blaxel/jupyter-notebook\",\n        memory: int = 4096,\n        ttl: str | None = None,\n        region: Optional[str] = None,\n    ):\n        super().__init__(additional_imports, logger, allow_pickle=allow_pickle)\n\n        try:\n            import blaxel  # noqa: F401\n        except ModuleNotFoundError:\n            raise ModuleNotFoundError(\n                \"Please install 'blaxel' extra to use BlaxelExecutor: `pip install 'smolagents[blaxel]'`\"\n            )\n\n        self.sandbox_name = sandbox_name or f\"smolagent-executor-{uuid.uuid4().hex[:8]}\"\n        self.image = image\n        self.memory = memory\n        self.region = region\n        self.port = 8888\n        self._cleaned_up = False  # Flag to prevent double cleanup\n\n        # Prepare sandbox creation parameters\n        token = secrets.token_urlsafe(16)\n        sandbox_config = {\n            \"metadata\": {\n                \"name\": self.sandbox_name,\n            },\n            \"spec\": {\n                \"runtime\": {\"image\": image, \"memory\": memory, \"ports\": [{\"target\": self.port}]},\n            },\n        }\n\n        if region:\n            sandbox_config[\"spec\"][\"region\"] = region\n\n        if ttl:\n            sandbox_config[\"spec\"][\"runtime\"][\"ttl\"] = ttl\n\n        # Create the sandbox\n        try:\n            # Create sandbox environment on Blaxel\n            self.sandbox = BlaxelExecutor._create_sandbox(sandbox_config)\n\n            # Create kernel via HTTP\n            from blaxel.core import settings\n\n            kernel_id = _create_kernel_http(\n                f\"{self.sandbox.metadata.url}/port/{self.port}/api/kernels?token={token}\",\n                self.logger,\n                
headers=settings.headers,\n            )\n\n            # Set up websocket URL\n            # Convert http/https to ws/wss\n            ws_scheme = \"wss\" if self.sandbox.metadata.url.startswith(\"https\") else \"ws\"\n            ws_base = self.sandbox.metadata.url.replace(\"https://\", \"\").replace(\"http://\", \"\")\n            self.ws_url = f\"{ws_scheme}://{ws_base}/port/{self.port}/api/kernels/{kernel_id}/channels?token={token}\"\n\n            # Install additional packages\n            self.installed_packages = self.install_packages(additional_imports)\n            self.logger.log(\"Blaxel is running\", level=LogLevel.INFO)\n        except Exception as e:\n            self.cleanup()\n            raise RuntimeError(f\"Failed to initialize Blaxel sandbox: {e}\") from e\n\n    @staticmethod\n    def _create_sandbox(config):\n        \"\"\"Helper method to create sandbox asynchronously.\"\"\"\n        from blaxel.core import SandboxInstance\n        from blaxel.core.client import client\n        from blaxel.core.client.api.compute import create_sandbox\n\n        response = create_sandbox.sync(client=client, body=config)\n        return SandboxInstance(response)\n\n    def run_code_raise_errors(self, code: str) -> CodeOutput:\n        \"\"\"\n        Execute Python code in the Blaxel sandbox and return the result.\n\n        Args:\n            code (`str`): Python code to execute.\n\n        Returns:\n            `CodeOutput`: Code output containing the result, logs, and whether it is the final answer.\n        \"\"\"\n        from blaxel.core import settings\n        from websocket import create_connection\n\n        headers = []\n        for key, value in settings.headers.items():\n            headers.append(f\"{key}: {value}\")\n        with closing(create_connection(self.ws_url, header=headers)) as ws:\n            return _websocket_run_code_raise_errors(code, ws, self.logger, self.allow_pickle)\n\n    def install_packages(self, additional_imports: 
list[str]) -> list[str]:\n        \"\"\"Helper method to install packages asynchronously.\"\"\"\n        if not additional_imports:\n            return []\n\n        from blaxel.core import settings\n        from blaxel.core.sandbox.client import client\n        from blaxel.core.sandbox.client.api.process import get_process_identifier, post_process\n        from blaxel.core.sandbox.client.models import ErrorResponse, ProcessResponse\n\n        try:\n            client.with_base_url(self.sandbox.metadata.url)\n            client.with_headers(settings.headers)\n\n            # Install packages using pip via run_code\n            self.logger.log(f\"Installing packages: {', '.join(additional_imports)}\", level=LogLevel.INFO)\n            pip_install_code = f\"pip install --root-user-action=ignore {' '.join(additional_imports)}\"\n\n            identifier = \"install-packages\"\n            body = {\n                \"name\": identifier,\n                \"command\": pip_install_code,\n            }\n            post_process.sync(client=client, body=body)\n\n            status = \"running\"\n            interval = 1000\n            max_wait = 600000\n            start_time = time.time() * 1000\n            logs = \"\"\n            exit_code = 0\n\n            while status == \"running\":\n                if (time.time() * 1000) - start_time > max_wait:\n                    raise Exception(\"Process did not finish in time\")\n                data = get_process_identifier.sync(identifier, client=client)\n                if isinstance(data, ProcessResponse):\n                    status = data.status or \"running\"\n                    exit_code = data.exit_code\n                    logs = data.logs\n                elif isinstance(data, ErrorResponse):\n                    raise Exception(f\"Failed to install packages: {data.message}\")\n                else:\n                    raise Exception(f\"Unknown response: {data}\")\n\n                if status == \"running\":\n  
                  time.sleep(interval / 1000)  # Convert to seconds\n\n            if exit_code != 0:\n                self.logger.log_error(f\"Failed to install packages (exit code {exit_code}): {logs}\")\n                return []\n\n            self.logger.log(f\"Successfully installed packages: {', '.join(additional_imports)}\", level=LogLevel.INFO)\n            return additional_imports\n\n        except Exception as e:\n            self.logger.log_error(f\"Error installing packages: {e}\")\n            return []\n\n    def _delete_sandbox(self):\n        \"\"\"Delete sandbox using Blaxel's sync API and wait for completion.\"\"\"\n        from blaxel.core.client import client\n        from blaxel.core.client.api.compute import delete_sandbox\n\n        self.logger.log(f\"Requesting sandbox {self.sandbox_name} deletion...\", level=LogLevel.INFO)\n        delete_sandbox.sync(client=client, sandbox_name=self.sandbox_name)\n\n    def cleanup(self):\n        \"\"\"Sync wrapper to clean up sandbox and resources.\"\"\"\n        # Prevent double cleanup\n        if self._cleaned_up:\n            return\n        self.logger.log(\"Shutting down sandbox...\", level=LogLevel.INFO)\n        self._cleaned_up = True\n        try:\n            self._delete_sandbox()\n        except Exception as e:\n            # Log cleanup errors but don't raise - cleanup should be best-effort\n            self.logger.log(f\"Error during cleanup: {e}\", level=LogLevel.INFO)\n        finally:\n            # Always clean up local references\n            if hasattr(self, \"sandbox\"):\n                del self.sandbox\n            self.logger.log(\"Sandbox cleanup completed\", level=LogLevel.INFO)\n\n    def delete(self):\n        \"\"\"Ensure cleanup on deletion.\"\"\"\n        self.cleanup()\n\n    def __del__(self):\n        \"\"\"Ensure cleanup on deletion.\"\"\"\n        try:\n            self.cleanup()\n        except Exception:\n            pass  # Silently ignore errors during 
cleanup\n\n\nclass WasmExecutor(RemotePythonExecutor):\n    \"\"\"\n    Remote Python code executor in a sandboxed WebAssembly environment powered by Pyodide and Deno.\n\n    This executor combines Deno's secure runtime with Pyodide's WebAssembly‑compiled Python interpreter to deliver\n    strong isolation guarantees while enabling full Python execution.\n\n    Args:\n        additional_imports (`list[str]`): Additional Python packages to install in the Pyodide environment.\n        logger (`Logger`): Logger to use for output and errors.\n        allow_pickle (`bool`, default `False`): Whether to allow pickle serialization for objects that cannot be safely serialized to JSON.\n            - `False` (default, recommended): Only safe JSON serialization is used. Raises error if object cannot be safely serialized.\n            - `True` (legacy mode): Tries safe JSON serialization first, falls back to pickle with warning if needed.\n\n            **Security Warning:** Pickle deserialization can execute arbitrary code. Only set `allow_pickle=True`\n            if you fully trust the execution environment and need backward compatibility with custom types.\n        deno_path (`str`, default `\"deno\"`): Path to the Deno executable. 
If not provided, uses \"deno\" from PATH.\n        deno_permissions (`list[str]`, *optional*): List of permissions to grant to the Deno runtime.\n            Defaults to the minimal permissions needed for execution.\n        timeout (`int`, default `60`): Timeout in seconds for code execution.\n    \"\"\"\n\n    DEFAULT_SERVER_HOST = \"127.0.0.1\"\n    DEFAULT_SERVER_PORT = 8000\n\n    def __init__(\n        self,\n        additional_imports: list[str],\n        logger,\n        allow_pickle: bool = False,\n        deno_path: str = \"deno\",\n        deno_permissions: list[str] | None = None,\n        timeout: int = 60,\n    ):\n        super().__init__(additional_imports, logger, allow_pickle=allow_pickle)\n\n        # Check if Deno is installed\n        try:\n            subprocess.run([deno_path, \"--version\"], capture_output=True, check=True)\n        except (subprocess.SubprocessError, FileNotFoundError):\n            raise RuntimeError(\n                \"Deno is not installed or not found in PATH. 
Please install Deno from https://deno.land/\"\n            )\n\n        self.deno_path = deno_path\n        self.timeout = timeout\n        self.token = secrets.token_urlsafe(16)\n        self.session = requests.Session()\n        self.session.headers[\"Authorization\"] = f\"Bearer {self.token}\"\n        self.server_host = self.DEFAULT_SERVER_HOST\n        self.server_port = self.DEFAULT_SERVER_PORT\n\n        # Create the Deno JavaScript runner file\n        self._create_deno_runner()\n\n        # Default minimal permissions needed\n        if deno_permissions is None:\n            deno_permissions = [\n                \"allow-net=\"\n                + \",\".join(\n                    [\n                        f\"{self.server_host}:{self.server_port}\",  # allow requests to the local server\n                        \"cdn.jsdelivr.net:443\",  # allow loading pyodide packages\n                        \"pypi.org:443,files.pythonhosted.org:443\",  # allow pyodide install packages from PyPI\n                    ]\n                ),\n                # FS permissions are always scoped to deno_cache_dir (the per-instance temp dir\n                # that cleanup() removes). This replaces the original global ~/.cache/deno\n                # permissions, bounding any write from attacker code to a short-lived\n                # directory that never affects other Deno processes or persists past teardown.\n                # --allow-read: required for Deno to load npm package assets at runtime\n                #               (e.g. pyodide.asm.wasm is read via Deno file APIs).\n                # --allow-write: required for pyodide's loadPackage() to cache downloaded\n                #                Python packages (e.g. 
micropip) to the Deno-backed FS.\n                f\"allow-read={self.deno_cache_dir}\",\n                f\"allow-write={self.deno_cache_dir}\",\n            ]\n        self.deno_permissions = [f\"--{perm}\" for perm in deno_permissions]\n\n        # Start the Deno server\n        self._start_deno_server()\n\n        # Install additional packages\n        self.installed_packages = self.install_packages(additional_imports)\n        self.logger.log(\"WasmExecutor is running\", level=LogLevel.INFO)\n\n    def _create_deno_runner(self):\n        \"\"\"Create the Deno JavaScript file that will run Pyodide and execute Python code.\"\"\"\n        # Create an isolated per-executor runtime directory to avoid sharing mutable Deno state\n        self.runner_dir = tempfile.mkdtemp(prefix=\"pyodide_deno_\")\n        self.runner_path = os.path.join(self.runner_dir, \"pyodide_runner.js\")\n\n        # Create the JavaScript runner file\n        with open(self.runner_path, \"w\") as f:\n            f.write(self._build_js_code())\n\n        # Isolate Deno's module cache inside the per-instance temp directory so it\n        # cannot affect other Deno processes and is removed when cleanup() runs.\n        self.deno_cache_dir = os.path.join(self.runner_dir, \"deno_cache\")\n\n    def _build_js_code(self) -> str:\n        \"\"\"Render JavaScript runner with injected auth token and with configured server host and port.\"\"\"\n        return (\n            self.JS_CODE_TEMPLATE.replace(\"__AUTH_TOKEN__\", self.token)\n            .replace(\"__SERVER_HOST__\", self.server_host)\n            .replace(\"__SERVER_PORT__\", str(self.server_port))\n        )\n\n    def _start_deno_server(self):\n        \"\"\"Start the Deno server that will run our JavaScript code.\"\"\"\n        cmd = [self.deno_path, \"run\"] + self.deno_permissions + [self.runner_path]\n\n        # Start the server process\n        self.server_process = subprocess.Popen(\n            cmd,\n            
stdout=subprocess.PIPE,\n            stderr=subprocess.PIPE,\n            text=True,\n            env={**os.environ, \"DENO_DIR\": self.deno_cache_dir},\n        )\n\n        # Wait for the server to start\n        time.sleep(2)  # Give the server time to start\n\n        # Check if the server started successfully\n        if self.server_process.poll() is not None:\n            stderr = self.server_process.stderr.read()\n            raise RuntimeError(f\"Failed to start Deno server: {stderr}\")\n\n        self.server_url = f\"http://{self.server_host}:{self.server_port}\"\n\n        # Test the connection\n        try:\n            response = self.session.get(self.server_url)\n            if response.status_code != 200:\n                raise RuntimeError(f\"Server responded with status code {response.status_code}: {response.text}\")\n        except requests.RequestException as e:\n            raise RuntimeError(f\"Failed to connect to Deno server: {e}\")\n\n    def run_code_raise_errors(self, code: str) -> CodeOutput:\n        \"\"\"\n        Execute Python code in the Pyodide environment and return the result.\n\n        Args:\n            code (`str`): Python code to execute.\n\n        Returns:\n            `CodeOutput`: Code output containing the result, logs, and whether it is the final answer.\n        \"\"\"\n        try:\n            # Prepare the request payload\n            payload = {\n                \"code\": code,\n                \"packages\": self.installed_packages,\n            }\n\n            # Send the request to the Deno server\n            response = self.session.post(self.server_url, json=payload, timeout=self.timeout)\n\n            if response.status_code != 200:\n                raise AgentError(f\"Server error: {response.text}\", self.logger)\n\n            result = None\n            is_final_answer = False\n\n            # Parse the response\n            result_data = response.json()\n\n            # Process the result\n            if 
result_data.get(\"result\"):\n                result = result_data.get(\"result\")\n            # Check for execution errors\n            elif result_data.get(\"error\"):\n                error = result_data[\"error\"]\n                if (\n                    error.get(\"pythonExceptionType\") == RemotePythonExecutor.FINAL_ANSWER_EXCEPTION\n                    and \"pythonExceptionValue\" in error\n                ):\n                    result = self._deserialize_final_answer(error[\"pythonExceptionValue\"], self.allow_pickle)\n                    is_final_answer = True\n                else:\n                    error_message = f\"{error.get('name', 'Error')}: {error.get('message', 'Unknown error')}\"\n                    if \"stack\" in error:\n                        error_message += f\"\\n{error['stack']}\"\n                    raise AgentError(error_message, self.logger)\n\n            # Get the execution logs\n            execution_logs = result_data.get(\"stdout\", \"\")\n\n            # Handle image results: decode to a PIL image, then return it wrapped in CodeOutput like any other result\n            if isinstance(result, dict) and result.get(\"type\") == \"image\":\n                image_data = result.get(\"data\", \"\")\n                decoded_bytes = base64.b64decode(image_data.encode(\"utf-8\"))\n                result = PIL.Image.open(BytesIO(decoded_bytes))\n\n            return CodeOutput(output=result, logs=execution_logs, is_final_answer=is_final_answer)\n\n        except requests.RequestException as e:\n            raise AgentError(f\"Failed to communicate with Deno server: {e}\", self.logger)\n\n    def install_packages(self, additional_imports: list[str]) -> list[str]:\n        \"\"\"\n        Install additional Python packages in the Pyodide environment.\n\n        Args:\n            additional_imports (`list[str]`): Package names to install.\n\n        Returns:\n            `list[str]`: Installed packages.\n        \"\"\"\n        # In Pyodide, we don't actually install packages here, but we keep track of 
them\n        # to load them when executing code\n        # TODO: Install  here instead?\n        self.logger.log(f\"Adding packages to load: {', '.join(additional_imports)}\", level=LogLevel.INFO)\n        return additional_imports\n\n    def cleanup(self):\n        \"\"\"Clean up resources used by the executor.\"\"\"\n        if hasattr(self, \"session\"):\n            self.session.close()\n\n        if hasattr(self, \"server_process\") and self.server_process:\n            self.logger.log(\"Stopping Deno server...\", level=LogLevel.INFO)\n            self.server_process.terminate()\n            try:\n                self.server_process.wait(timeout=5)\n            except subprocess.TimeoutExpired:\n                self.server_process.kill()\n\n        # Remove the temporary directory\n        if hasattr(self, \"runner_dir\") and os.path.exists(self.runner_dir):\n            import shutil\n\n            shutil.rmtree(self.runner_dir)\n\n    def delete(self):\n        \"\"\"Ensure cleanup on deletion.\"\"\"\n        self.cleanup()\n\n    JS_CODE_TEMPLATE = dedent(\"\"\"\\\n        // pyodide_runner.js - Runs Python code in Pyodide within Deno\n        import { serve } from \"https://deno.land/std/http/server.ts\";\n        import { loadPyodide } from \"npm:pyodide\";\n\n        const AUTH_TOKEN = \"__AUTH_TOKEN__\";\n\n        // Initialize Pyodide instance\n        const pyodidePromise = loadPyodide();\n\n        // Function to execute Python code and return the result\n        async function executePythonCode(code) {\n          const pyodide = await pyodidePromise;\n\n          // Create a capture for stdout\n          pyodide.runPython(`\n            import sys\n            import io\n            sys.stdout = io.StringIO()\n          `);\n\n          // Execute the code and capture any errors\n          let result = null;\n          let error = null;\n          let stdout = \"\";\n\n          try {\n            // Execute the code\n            result = await 
pyodide.runPythonAsync(code);\n\n            // Get captured stdout\n            stdout = pyodide.runPython(\"sys.stdout.getvalue()\");\n          } catch (e) {\n            error = {\n              name: e.constructor.name,\n              message: e.message,\n              stack: e.stack\n            };\n\n            // Extract Python exception details\n            if (e.constructor.name === \"PythonError\") {\n              // Get the Python exception type from the error message (at the end of the traceback)\n              const errorMatch = e.message.match(/\\\\n([^:]+Exception): /);\n              if (errorMatch) {\n                error.pythonExceptionType = errorMatch[1].split(\".\").pop();\n              }\n\n              // If the error is a FinalAnswerException, extract its encoded value\n              if (error.pythonExceptionType === \"FinalAnswerException\") {\n                // Extract the base64 encoded value from the error message\n                const valueMatch = e.message.match(/FinalAnswerException: (.*?)(?:\\\\n|$)/);\n                if (valueMatch) {\n                  error.pythonExceptionValue = valueMatch[1];\n                }\n              }\n            }\n          }\n\n          return {\n            result,\n            stdout,\n            error\n          };\n        }\n\n        // Start a simple HTTP server to receive code execution requests\n\n        serve(async (req) => {\n          const authHeader = req.headers.get(\"Authorization\");\n          if (!authHeader || authHeader !== `Bearer ${AUTH_TOKEN}`) {\n            return new Response(\"Unauthorized\", { status: 401 });\n          }\n\n          if (req.method === \"POST\") {\n            try {\n              const body = await req.json();\n              const { code, packages = [] } = body;\n\n              // Load any requested packages\n              if (packages && packages.length > 0) {\n                const pyodide = await pyodidePromise;\n                
//await pyodide.loadPackagesFromImports(code);\n                await pyodide.loadPackage(\"micropip\");\n                const micropip = pyodide.pyimport(\"micropip\");\n                try {\n                  await micropip.install(packages);\n                } catch (e) {\n                  console.error(`Failed to load package ${packages}: ${e.message}`);\n                }\n              }\n\n              const result = await executePythonCode(code);\n              return new Response(JSON.stringify(result), {\n                headers: { \"Content-Type\": \"application/json\" }\n              });\n            } catch (e) {\n              return new Response(JSON.stringify({ error: e.message }), {\n                status: 500,\n                headers: { \"Content-Type\": \"application/json\" }\n              });\n            }\n          }\n\n          return new Response(\"Pyodide-Deno Executor is running. Send POST requests with code to execute.\", {\n            headers: { \"Content-Type\": \"text/plain\" }\n          });\n        }, { hostname: \"__SERVER_HOST__\", port: __SERVER_PORT__ });\n        \"\"\")\n"
  },
  {
    "path": "src/smolagents/serialization.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n\"\"\"\nSafe serialization module for remote executor communication.\n\nProvides JSON-based serialization with optional pickle fallback for types\nthat cannot be safely serialized.\n\n**Security Note:** Pickle deserialization can execute arbitrary code. This module\ndefaults to safe JSON-only serialization. Only enable pickle fallback\n(allow_insecure_serializer=True) if you fully trust the execution environment.\n\"\"\"\n\nimport base64\nimport json\nimport pickle\nfrom io import BytesIO\nfrom typing import Any\n\n\n__all__ = [\"SerializationError\", \"SafeSerializer\"]\n\n\nclass SerializationError(Exception):\n    \"\"\"Raised when a type cannot be safely serialized.\"\"\"\n\n    pass\n\n\nclass SafeSerializer:\n    \"\"\"JSON-based serializer with type markers for safe serialization.\n\n    Supports:\n    - Basic: str, int, float, bool, None, list, dict\n    - Extended: tuple, set, frozenset, bytes, complex, datetime/date/time/timedelta\n    - Optional: numpy.ndarray, PIL.Image, dataclasses, Decimal, Path\n\n    The serializer uses a prefix system to distinguish between formats:\n    - \"safe:\" prefix for JSON-serialized data\n    - \"pickle:\" prefix for pickle-serialized data (when allowed)\n    \"\"\"\n\n    SAFE_PREFIX = \"safe:\"\n\n    # Cache for optional type classes (avoids 
repeated import attempts)\n    _optional_types_cache: dict = {}\n\n    @classmethod\n    def _get_optional_type(cls, module: str, attr: str):\n        \"\"\"Get optional type class with caching to avoid repeated imports.\"\"\"\n        key = f\"{module}.{attr}\"\n        if key not in cls._optional_types_cache:\n            try:\n                mod = __import__(module, fromlist=[attr])\n                cls._optional_types_cache[key] = getattr(mod, attr)\n            except (ImportError, AttributeError):\n                cls._optional_types_cache[key] = None\n        return cls._optional_types_cache[key]\n\n    @staticmethod\n    def to_json_safe(obj: Any) -> Any:\n        \"\"\"Convert Python objects to JSON-serializable format with type markers.\n\n        Args:\n            obj: Object to convert.\n\n        Returns:\n            JSON-serializable representation.\n\n        Raises:\n            SerializationError: If the object cannot be safely serialized.\n        \"\"\"\n        # Fast path: use exact type check for primitives (most common case)\n        obj_type = type(obj)\n        if obj_type is str or obj_type is int or obj_type is float or obj_type is bool or obj is None:\n            return obj\n\n        # Fast path: list (very common for return values)\n        if obj_type is list:\n            return [SafeSerializer.to_json_safe(item) for item in obj]\n\n        # Fast path: tuple (common for multiple return values)\n        if obj_type is tuple:\n            return {\"__type__\": \"tuple\", \"data\": [SafeSerializer.to_json_safe(item) for item in obj]}\n\n        # Fast path: dict (common, check string keys)\n        if obj_type is dict:\n            if all(type(k) is str for k in obj):\n                return {k: SafeSerializer.to_json_safe(v) for k, v in obj.items()}\n            return {\n                \"__type__\": \"dict_with_complex_keys\",\n                \"data\": [[SafeSerializer.to_json_safe(k), SafeSerializer.to_json_safe(v)] for k, v 
in obj.items()],\n            }\n\n        # Other builtin types - exact type checks\n        if obj_type is set:\n            return {\"__type__\": \"set\", \"data\": [SafeSerializer.to_json_safe(item) for item in obj]}\n        if obj_type is frozenset:\n            return {\"__type__\": \"frozenset\", \"data\": [SafeSerializer.to_json_safe(item) for item in obj]}\n        if obj_type is bytes:\n            return {\"__type__\": \"bytes\", \"data\": base64.b64encode(obj).decode()}\n        if obj_type is complex:\n            return {\"__type__\": \"complex\", \"real\": obj.real, \"imag\": obj.imag}\n\n        # Use type module/name for lazy-loaded types (avoids import until needed)\n        type_module = getattr(obj_type, \"__module__\", \"\")\n        type_name = obj_type.__name__\n\n        # datetime module types (check module first to skip unrelated types quickly)\n        if type_module == \"datetime\":\n            if type_name == \"datetime\":\n                return {\"__type__\": \"datetime\", \"data\": obj.isoformat()}\n            if type_name == \"date\":\n                return {\"__type__\": \"date\", \"data\": obj.isoformat()}\n            if type_name == \"time\":\n                return {\"__type__\": \"time\", \"data\": obj.isoformat()}\n            if type_name == \"timedelta\":\n                return {\"__type__\": \"timedelta\", \"total_seconds\": obj.total_seconds()}\n\n        # decimal.Decimal\n        if type_module == \"decimal\" and type_name == \"Decimal\":\n            return {\"__type__\": \"Decimal\", \"data\": str(obj)}\n\n        # pathlib.Path (and subclasses like PosixPath, WindowsPath)\n        if type_module.startswith(\"pathlib\") and \"Path\" in type_name:\n            return {\"__type__\": \"Path\", \"data\": str(obj)}\n\n        # PIL.Image - use cached import\n        pil_image_cls = SafeSerializer._get_optional_type(\"PIL.Image\", \"Image\")\n        if pil_image_cls is not None and isinstance(obj, pil_image_cls):\n    
        buffer = BytesIO()\n            obj.save(buffer, format=\"PNG\")\n            return {\"__type__\": \"PIL.Image\", \"data\": base64.b64encode(buffer.getvalue()).decode()}\n\n        # numpy types - use cached import\n        if type_module == \"numpy\" or type_module.startswith(\"numpy.\"):\n            np_ndarray = SafeSerializer._get_optional_type(\"numpy\", \"ndarray\")\n            if np_ndarray is not None and obj_type is np_ndarray:\n                return {\"__type__\": \"ndarray\", \"data\": obj.tolist(), \"dtype\": str(obj.dtype)}\n            np_integer = SafeSerializer._get_optional_type(\"numpy\", \"integer\")\n            np_floating = SafeSerializer._get_optional_type(\"numpy\", \"floating\")\n            if (np_integer and isinstance(obj, np_integer)) or (np_floating and isinstance(obj, np_floating)):\n                return obj.item()\n\n        # dataclass - check last as is_dataclass() has overhead\n        import dataclasses\n\n        if dataclasses.is_dataclass(obj) and not isinstance(obj, type):\n            return {\n                \"__type__\": \"dataclass\",\n                \"class_name\": type_name,\n                \"module\": type_module,\n                \"data\": {f.name: SafeSerializer.to_json_safe(getattr(obj, f.name)) for f in dataclasses.fields(obj)},\n            }\n\n        raise SerializationError(f\"Cannot safely serialize object of type {type_name}\")\n\n    @staticmethod\n    def from_json_safe(obj: Any) -> Any:\n        \"\"\"\n        Convert JSON-safe format back to Python objects.\n\n        Args:\n            obj: JSON-safe representation\n\n        Returns:\n            Original Python object\n        \"\"\"\n        if isinstance(obj, dict):\n            if \"__type__\" in obj:\n                obj_type = obj[\"__type__\"]\n                if obj_type == \"bytes\":\n                    return base64.b64decode(obj[\"data\"])\n                elif obj_type == \"PIL.Image\":\n                    try:\n          
              import PIL.Image\n\n                        img_bytes = base64.b64decode(obj[\"data\"])\n                        return PIL.Image.open(BytesIO(img_bytes))\n                    except ImportError:\n                        return {\"__type__\": \"PIL.Image\", \"data\": obj[\"data\"]}\n                elif obj_type == \"set\":\n                    return set(SafeSerializer.from_json_safe(item) for item in obj[\"data\"])\n                elif obj_type == \"tuple\":\n                    return tuple(SafeSerializer.from_json_safe(item) for item in obj[\"data\"])\n                elif obj_type == \"complex\":\n                    return complex(obj[\"real\"], obj[\"imag\"])\n                elif obj_type == \"frozenset\":\n                    return frozenset(SafeSerializer.from_json_safe(item) for item in obj[\"data\"])\n                elif obj_type == \"dict_with_complex_keys\":\n                    return {SafeSerializer.from_json_safe(k): SafeSerializer.from_json_safe(v) for k, v in obj[\"data\"]}\n                elif obj_type == \"datetime\":\n                    from datetime import datetime\n\n                    return datetime.fromisoformat(obj[\"data\"])\n                elif obj_type == \"date\":\n                    from datetime import date\n\n                    return date.fromisoformat(obj[\"data\"])\n                elif obj_type == \"time\":\n                    from datetime import time\n\n                    return time.fromisoformat(obj[\"data\"])\n                elif obj_type == \"timedelta\":\n                    from datetime import timedelta\n\n                    return timedelta(seconds=obj[\"total_seconds\"])\n                elif obj_type == \"Decimal\":\n                    from decimal import Decimal\n\n                    return Decimal(obj[\"data\"])\n                elif obj_type == \"Path\":\n                    from pathlib import Path\n\n                    return Path(obj[\"data\"])\n                elif obj_type == 
\"ndarray\":\n                    try:\n                        import numpy as np\n\n                        return np.array(obj[\"data\"], dtype=obj[\"dtype\"])\n                    except ImportError:\n                        return obj[\"data\"]  # Return as list if numpy not available\n                elif obj_type == \"dataclass\":\n                    # For dataclasses, we return a dict representation\n                    # since we can't reconstruct the actual class without access to it\n                    return {\n                        \"__dataclass__\": obj[\"class_name\"],\n                        \"__module__\": obj[\"module\"],\n                        **{k: SafeSerializer.from_json_safe(v) for k, v in obj[\"data\"].items()},\n                    }\n            return {k: SafeSerializer.from_json_safe(v) for k, v in obj.items()}\n        elif isinstance(obj, list):\n            return [SafeSerializer.from_json_safe(item) for item in obj]\n        return obj\n\n    @staticmethod\n    def dumps(obj: Any, allow_pickle: bool = False) -> str:\n        \"\"\"\n        Serialize object to string.\n\n        Args:\n            obj: Object to serialize\n            allow_pickle: If False (default), use ONLY safe JSON serialization (error if fails).\n                         If True, try safe first, fallback to pickle with warning.\n\n        Returns:\n            str: Serialized string (\"safe:...\" for JSON, \"pickle:...\" for pickle)\n\n        Raises:\n            SerializationError: If allow_pickle=False and object cannot be safely serialized\n        \"\"\"\n        if not allow_pickle:\n            # Safe ONLY mode - no pickle fallback\n            json_safe = SafeSerializer.to_json_safe(obj)  # Raises SerializationError if fails\n            return SafeSerializer.SAFE_PREFIX + json.dumps(json_safe)\n        else:\n            # Try safe first, fallback to pickle\n            try:\n                json_safe = SafeSerializer.to_json_safe(obj)\n         
       return SafeSerializer.SAFE_PREFIX + json.dumps(json_safe)\n            except SerializationError:\n                # Warn about insecure pickle usage\n                import warnings\n\n                warnings.warn(\n                    \"Falling back to insecure pickle serialization. \"\n                    \"This is a security risk and will be removed in a future version. \"\n                    \"Consider using only safe serializable types (primitives, lists, dicts, \"\n                    \"numpy arrays, PIL images, datetime objects, dataclasses).\",\n                    FutureWarning,\n                    stacklevel=2,\n                )\n                # Fallback to pickle (with prefix)\n                try:\n                    return \"pickle:\" + base64.b64encode(pickle.dumps(obj)).decode()\n                except (pickle.PicklingError, TypeError, AttributeError) as e:\n                    raise SerializationError(f\"Cannot serialize object: {e}\") from e\n\n    @staticmethod\n    def loads(data: str, allow_pickle: bool = False) -> Any:\n        \"\"\"\n        Deserialize string with format detection.\n\n        Args:\n            data: Serialized string (with \"safe:\" or \"pickle:\" prefix)\n            allow_pickle: If False (default), reject pickle data (strict safe mode).\n                         If True, accept both safe and pickle formats.\n\n        Returns:\n            Deserialized object\n\n        Raises:\n            SerializationError: If pickle data received but allow_pickle=False\n        \"\"\"\n        if data.startswith(SafeSerializer.SAFE_PREFIX):\n            json_data = json.loads(data[len(SafeSerializer.SAFE_PREFIX) :])\n            return SafeSerializer.from_json_safe(json_data)\n        elif data.startswith(\"pickle:\"):\n            # Explicit pickle prefix\n            if not allow_pickle:\n                raise SerializationError(\n                    \"Pickle data rejected: allow_pickle=False requires safe-only data. 
\"\n                    \"This data is pickle-serialized. To deserialize it, set \"\n                    \"allow_pickle=True (not recommended for untrusted data).\"\n                )\n            # Warn about insecure pickle deserialization\n            import warnings\n\n            warnings.warn(\n                \"Deserializing pickle data. This is a security risk if the data is untrusted.\",\n                FutureWarning,\n                stacklevel=2,\n            )\n            return pickle.loads(base64.b64decode(data[7:]))\n        else:\n            # No prefix - legacy format, assume pickle\n            if not allow_pickle:\n                raise SerializationError(\n                    \"Pickle data rejected: allow_pickle=False requires safe-only data. \"\n                    \"This data appears to be pickle-serialized (legacy format). To deserialize it, set \"\n                    \"allow_pickle=True (not recommended for untrusted data).\"\n                )\n            # Warn about insecure pickle deserialization\n            import warnings\n\n            warnings.warn(\n                \"Deserializing pickle data. 
This is a security risk if the data is untrusted.\",\n                FutureWarning,\n                stacklevel=2,\n            )\n            return pickle.loads(base64.b64decode(data))\n\n    @staticmethod\n    def _extract_method_body(method) -> str:\n        \"\"\"Extract method body without the def line and dedent it.\"\"\"\n        import inspect\n        import textwrap\n\n        source = inspect.getsource(method)\n        lines = source.split(\"\\n\")\n        # Skip the def line and docstring\n        body_start = 0\n        for i, line in enumerate(lines):\n            if '\"\"\"' in line and i > 0:\n                # Find end of docstring\n                if line.count('\"\"\"') == 2:\n                    body_start = i + 1\n                    break\n                for j in range(i + 1, len(lines)):\n                    if '\"\"\"' in lines[j]:\n                        body_start = j + 1\n                        break\n                break\n            elif line.strip() and not line.strip().startswith(\"def \") and not line.strip().startswith(\"@\"):\n                body_start = i\n                break\n\n        body = \"\\n\".join(lines[body_start:])\n        return textwrap.dedent(body)\n\n    @staticmethod\n    def get_safe_serializer_code() -> str:\n        \"\"\"\n        Returns the SafeSerializer class definition as string for injection into sandbox.\n\n        This generates a standalone version from the actual implementation to avoid duplication.\n        \"\"\"\n        import inspect\n\n        # Generate to_json_safe from actual implementation\n        to_json_safe_source = inspect.getsource(SafeSerializer.to_json_safe)\n        # Make it standalone (remove @staticmethod, change self references)\n        to_json_safe_source = to_json_safe_source.replace(\"@staticmethod\\n    \", \"\")\n        to_json_safe_source = to_json_safe_source.replace(\"SafeSerializer.to_json_safe\", \"to_json_safe\")\n\n        # Generate from_json_safe from 
actual implementation\n        from_json_safe_source = inspect.getsource(SafeSerializer.from_json_safe)\n        from_json_safe_source = from_json_safe_source.replace(\"@staticmethod\\n    \", \"\")\n        from_json_safe_source = from_json_safe_source.replace(\"SafeSerializer.from_json_safe\", \"from_json_safe\")\n\n        return f'''\nclass SerializationError(Exception):\n    \"\"\"Raised when a type cannot be safely serialized.\"\"\"\n    pass\n\nclass SafeSerializer:\n    \"\"\"Safe JSON-based serializer for sandbox use.\"\"\"\n\n    SAFE_PREFIX = \"safe:\"\n\n    {to_json_safe_source}\n\n    {from_json_safe_source}\n\n    @staticmethod\n    def dumps(obj, allow_pickle=False):\n        import json\n        import base64\n        import pickle\n\n        if not allow_pickle:\n            # Safe ONLY - no pickle fallback\n            json_safe = to_json_safe(obj)  # Raises SerializationError if fails\n            return SafeSerializer.SAFE_PREFIX + json.dumps(json_safe)\n        else:\n            # Try safe first, fallback to pickle if allowed\n            try:\n                json_safe = to_json_safe(obj)\n                return SafeSerializer.SAFE_PREFIX + json.dumps(json_safe)\n            except SerializationError:\n                try:\n                    return \"pickle:\" + base64.b64encode(pickle.dumps(obj)).decode()\n                except (pickle.PicklingError, TypeError, AttributeError) as e:\n                    raise SerializationError(f\"Cannot serialize object: {{e}}\") from e\n\n    @staticmethod\n    def loads(data, allow_pickle=False):\n        import json\n        import base64\n        import pickle\n\n        if data.startswith(SafeSerializer.SAFE_PREFIX):\n            json_data = json.loads(data[len(SafeSerializer.SAFE_PREFIX):])\n            return from_json_safe(json_data)\n        elif data.startswith(\"pickle:\"):\n            if not allow_pickle:\n                raise SerializationError(\"Pickle data rejected: 
allow_pickle=False\")\n            return pickle.loads(base64.b64decode(data[7:]))\n        else:\n            # Legacy format (no prefix) - assume pickle\n            if not allow_pickle:\n                raise SerializationError(\"Pickle data rejected: allow_pickle=False\")\n            return pickle.loads(base64.b64decode(data))\n'''\n\n    @staticmethod\n    def get_deserializer_code(allow_pickle: bool) -> str:\n        \"\"\"\n        Generate deserializer function for remote execution with setting baked in.\n\n        This generates code from the actual implementation to avoid duplication.\n\n        Args:\n            allow_pickle: Whether to allow pickle deserialization\n\n        Returns:\n            Python code string with _deserialize function\n        \"\"\"\n        import inspect\n        import textwrap\n\n        # Build a standalone _from_json_safe function from the source of from_json_safe.\n        from_json_safe_source = inspect.getsource(SafeSerializer.from_json_safe)\n        from_json_safe_source = textwrap.dedent(from_json_safe_source)\n        if from_json_safe_source.startswith(\"@staticmethod\\n\"):\n            from_json_safe_source = from_json_safe_source[len(\"@staticmethod\\n\") :]\n        from_json_safe_source = from_json_safe_source.replace(\"def from_json_safe(\", \"def _from_json_safe(\")\n        from_json_safe_source = from_json_safe_source.replace(\"SafeSerializer.from_json_safe\", \"_from_json_safe\")\n\n        if allow_pickle:\n            prefixed_pickle_branch = [\n                \"        import pickle\",\n                \"        return pickle.loads(base64.b64decode(data[7:]))\",\n            ]\n            legacy_pickle_branch = [\n                \"        import pickle\",\n                \"        return pickle.loads(base64.b64decode(data))\",\n            ]\n        else:\n            prefixed_pickle_branch = [\n                '        raise SerializationError(\"Pickle data rejected: allow_pickle=False\")',\n   
         ]\n            legacy_pickle_branch = [\n                '        raise SerializationError(\"Pickle data rejected: allow_pickle=False\")',\n            ]\n\n        lines = [\n            \"import base64\",\n            \"from io import BytesIO\",\n            \"from typing import Any\",\n            \"\",\n            \"class SerializationError(Exception):\",\n            \"    pass\",\n            \"\",\n            from_json_safe_source.rstrip(),\n            \"\",\n            \"def _deserialize(data):\",\n            \"    import json\",\n            '    if isinstance(data, str) and data.startswith(\"safe:\"):',\n            \"        json_data = json.loads(data[5:])\",\n            \"        return _from_json_safe(json_data)\",\n            '    elif isinstance(data, str) and data.startswith(\"pickle:\"):',\n            *prefixed_pickle_branch,\n            \"    else:\",\n            \"        # No safe prefix - legacy format, assume pickle\",\n            *legacy_pickle_branch,\n            \"\",\n        ]\n        return \"\\n\".join(lines)\n"
  },
  {
    "path": "src/smolagents/tool_validation.py",
    "content": "import ast\nimport builtins\nfrom itertools import zip_longest\n\nfrom .utils import BASE_BUILTIN_MODULES, get_source, is_valid_name\n\n\n_BUILTIN_NAMES = set(vars(builtins))\n\n\nclass MethodChecker(ast.NodeVisitor):\n    \"\"\"\n    Checks that a method\n    - only uses defined names\n    - contains no local imports (e.g. numpy is ok but local_script is not)\n    \"\"\"\n\n    def __init__(self, class_attributes: set[str], check_imports: bool = True):\n        self.undefined_names = set()\n        self.imports = {}\n        self.from_imports = {}\n        self.assigned_names = set()\n        self.arg_names = set()\n        self.class_attributes = class_attributes\n        self.errors = []\n        self.check_imports = check_imports\n        self.typing_names = {\"Any\"}\n        self.defined_classes = set()\n\n    def visit_arguments(self, node):\n        \"\"\"Collect function arguments\"\"\"\n        self.arg_names = {arg.arg for arg in node.args}\n        if node.kwarg:\n            self.arg_names.add(node.kwarg.arg)\n        if node.vararg:\n            self.arg_names.add(node.vararg.arg)\n\n    def visit_Import(self, node):\n        for name in node.names:\n            actual_name = name.asname or name.name\n            self.imports[actual_name] = name.name\n\n    def visit_ImportFrom(self, node):\n        module = node.module or \"\"\n        for name in node.names:\n            actual_name = name.asname or name.name\n            self.from_imports[actual_name] = (module, name.name)\n\n    def visit_Assign(self, node):\n        for target in node.targets:\n            if isinstance(target, ast.Name):\n                self.assigned_names.add(target.id)\n            elif isinstance(target, (ast.Tuple, ast.List)):\n                for elt in target.elts:\n                    if isinstance(elt, ast.Name):\n                        self.assigned_names.add(elt.id)\n        self.visit(node.value)\n\n    def visit_With(self, node):\n        
\"\"\"Track aliases in 'with' statements (the 'y' in 'with X as y')\"\"\"\n        for item in node.items:\n            if item.optional_vars:  # This is the 'y' in 'with X as y'\n                if isinstance(item.optional_vars, ast.Name):\n                    self.assigned_names.add(item.optional_vars.id)\n        self.generic_visit(node)\n\n    def visit_ExceptHandler(self, node):\n        \"\"\"Track exception aliases (the 'e' in 'except Exception as e')\"\"\"\n        if node.name:  # This is the 'e' in 'except Exception as e'\n            self.assigned_names.add(node.name)\n        self.generic_visit(node)\n\n    def visit_AnnAssign(self, node):\n        \"\"\"Track annotated assignments.\"\"\"\n        if isinstance(node.target, ast.Name):\n            self.assigned_names.add(node.target.id)\n        if node.value:\n            self.visit(node.value)\n\n    def visit_For(self, node):\n        target = node.target\n        if isinstance(target, ast.Name):\n            self.assigned_names.add(target.id)\n        elif isinstance(target, ast.Tuple):\n            for elt in target.elts:\n                if isinstance(elt, ast.Name):\n                    self.assigned_names.add(elt.id)\n        self.generic_visit(node)\n\n    def _handle_comprehension_generators(self, generators):\n        \"\"\"Helper method to handle generators in all types of comprehensions\"\"\"\n        for generator in generators:\n            if isinstance(generator.target, ast.Name):\n                self.assigned_names.add(generator.target.id)\n            elif isinstance(generator.target, ast.Tuple):\n                for elt in generator.target.elts:\n                    if isinstance(elt, ast.Name):\n                        self.assigned_names.add(elt.id)\n\n    def visit_ListComp(self, node):\n        \"\"\"Track variables in list comprehensions\"\"\"\n        self._handle_comprehension_generators(node.generators)\n        self.generic_visit(node)\n\n    def visit_DictComp(self, 
node):\n        \"\"\"Track variables in dictionary comprehensions\"\"\"\n        self._handle_comprehension_generators(node.generators)\n        self.generic_visit(node)\n\n    def visit_SetComp(self, node):\n        \"\"\"Track variables in set comprehensions\"\"\"\n        self._handle_comprehension_generators(node.generators)\n        self.generic_visit(node)\n\n    def visit_Attribute(self, node):\n        if not (isinstance(node.value, ast.Name) and node.value.id == \"self\"):\n            self.generic_visit(node)\n\n    def visit_ClassDef(self, node):\n        \"\"\"Track class definitions\"\"\"\n        self.defined_classes.add(node.name)\n        self.generic_visit(node)\n\n    def visit_Name(self, node):\n        if isinstance(node.ctx, ast.Load):\n            if not (\n                node.id in _BUILTIN_NAMES\n                or node.id in BASE_BUILTIN_MODULES\n                or node.id in self.arg_names\n                or node.id == \"self\"\n                or node.id in self.class_attributes\n                or node.id in self.imports\n                or node.id in self.from_imports\n                or node.id in self.assigned_names\n                or node.id in self.typing_names\n                or node.id in self.defined_classes\n            ):\n                self.errors.append(f\"Name '{node.id}' is undefined.\")\n\n    def visit_Call(self, node):\n        if isinstance(node.func, ast.Name):\n            if not (\n                node.func.id in _BUILTIN_NAMES\n                or node.func.id in BASE_BUILTIN_MODULES\n                or node.func.id in self.arg_names\n                or node.func.id == \"self\"\n                or node.func.id in self.class_attributes\n                or node.func.id in self.imports\n                or node.func.id in self.from_imports\n                or node.func.id in self.assigned_names\n                or node.func.id in self.defined_classes\n            ):\n                self.errors.append(f\"Name 
'{node.func.id}' is undefined.\")\n        self.generic_visit(node)\n\n\ndef validate_tool_attributes(cls, check_imports: bool = True) -> None:\n    \"\"\"\n    Validates that a Tool class follows the proper patterns:\n    0. Any argument of __init__ should have a default.\n    Args chosen at init are not traceable, so we cannot rebuild the source code for them, thus any important arg should be defined as a class attribute.\n    1. About the class:\n        - Class attributes should only be strings or dicts\n        - Class attributes cannot be complex attributes\n    2. About all class methods:\n        - Imports must be from packages, not local files\n        - All methods must be self-contained\n\n    Raises all errors encountered, if no error returns None.\n    \"\"\"\n\n    class ClassLevelChecker(ast.NodeVisitor):\n        def __init__(self):\n            self.imported_names = set()\n            self.complex_attributes = set()\n            self.class_attributes = set()\n            self.non_defaults = set()\n            self.non_literal_defaults = set()\n            self.in_method = False\n            self.invalid_attributes = []\n\n        def visit_FunctionDef(self, node):\n            if node.name == \"__init__\":\n                self._check_init_function_parameters(node)\n            old_context = self.in_method\n            self.in_method = True\n            self.generic_visit(node)\n            self.in_method = old_context\n\n        def visit_Assign(self, node):\n            if self.in_method:\n                return\n            # Track class attributes\n            for target in node.targets:\n                if isinstance(target, ast.Name):\n                    self.class_attributes.add(target.id)\n\n            # Check if the assignment is more complex than simple literals\n            if not all(isinstance(val, (ast.Constant, ast.Dict, ast.List, ast.Set)) for val in ast.walk(node.value)):\n                for target in node.targets:\n             
       if isinstance(target, ast.Name):\n                        self.complex_attributes.add(target.id)\n\n            # Check specific class attributes\n            if getattr(node.targets[0], \"id\", \"\") == \"name\":\n                if not isinstance(node.value, ast.Constant):\n                    self.invalid_attributes.append(f\"Class attribute 'name' must be a constant, found '{node.value}'\")\n                elif not isinstance(node.value.value, str):\n                    self.invalid_attributes.append(\n                        f\"Class attribute 'name' must be a string, found '{node.value.value}'\"\n                    )\n                elif not is_valid_name(node.value.value):\n                    self.invalid_attributes.append(\n                        f\"Class attribute 'name' must be a valid Python identifier and not a reserved keyword, found '{node.value.value}'\"\n                    )\n\n        def _check_init_function_parameters(self, node):\n            # Check defaults in parameters\n            for arg, default in reversed(list(zip_longest(reversed(node.args.args), reversed(node.args.defaults)))):\n                if default is None:\n                    if arg.arg != \"self\":\n                        self.non_defaults.add(arg.arg)\n                elif not isinstance(default, (ast.Constant, ast.Dict, ast.List, ast.Set)):\n                    self.non_literal_defaults.add(arg.arg)\n\n    class_level_checker = ClassLevelChecker()\n    source = get_source(cls)\n    tree = ast.parse(source)\n    class_node = tree.body[0]\n    if not isinstance(class_node, ast.ClassDef):\n        raise ValueError(\"Source code must define a class\")\n    class_level_checker.visit(class_node)\n\n    errors = []\n    # Check invalid class attributes\n    if class_level_checker.invalid_attributes:\n        errors += class_level_checker.invalid_attributes\n    if class_level_checker.complex_attributes:\n        errors.append(\n            f\"Complex attributes 
should be defined in __init__, not as class attributes: \"\n            f\"{', '.join(class_level_checker.complex_attributes)}\"\n        )\n    if class_level_checker.non_defaults:\n        errors.append(\n            f\"Parameters in __init__ must have default values, found required parameters: \"\n            f\"{', '.join(class_level_checker.non_defaults)}\"\n        )\n    if class_level_checker.non_literal_defaults:\n        errors.append(\n            f\"Parameters in __init__ must have literal default values, found non-literal defaults: \"\n            f\"{', '.join(class_level_checker.non_literal_defaults)}\"\n        )\n\n    # Run checks on all methods\n    for node in class_node.body:\n        if isinstance(node, ast.FunctionDef):\n            method_checker = MethodChecker(class_level_checker.class_attributes, check_imports=check_imports)\n            method_checker.visit(node)\n            errors += [f\"- {node.name}: {error}\" for error in method_checker.errors]\n\n    if errors:\n        raise ValueError(f\"Tool validation failed for {cls.__name__}:\\n\" + \"\\n\".join(errors))\n    return\n"
  },
  {
    "path": "src/smolagents/tools.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nfrom __future__ import annotations\n\nimport ast\nimport inspect\nimport json\nimport logging\nimport os\nimport sys\nimport tempfile\nimport textwrap\nimport types\nimport warnings\nfrom abc import ABC, abstractmethod\nfrom collections.abc import Callable\nfrom contextlib import contextmanager\nfrom functools import wraps\nfrom pathlib import Path\nfrom typing import TYPE_CHECKING, Any\n\nfrom huggingface_hub import (\n    CommitOperationAdd,\n    create_commit,\n    create_repo,\n    get_collection,\n    hf_hub_download,\n    metadata_update,\n)\n\nfrom ._function_type_hints_utils import (\n    TypeHintParsingException,\n    _convert_type_hints_to_json_schema,\n    _get_json_schema_type,\n    get_imports,\n    get_json_schema,\n)\nfrom .agent_types import AgentAudio, AgentImage, handle_agent_input_types, handle_agent_output_types\nfrom .tool_validation import MethodChecker, validate_tool_attributes\nfrom .utils import (\n    BASE_BUILTIN_MODULES,\n    _is_package_available,\n    get_source,\n    instance_to_source,\n    is_valid_name,\n)\n\n\nif TYPE_CHECKING:\n    import mcp\n\n\nlogger = logging.getLogger(__name__)\n\n\ndef validate_after_init(cls):\n    original_init = cls.__init__\n\n    @wraps(original_init)\n    def new_init(self, *args, **kwargs):\n        original_init(self, *args, 
**kwargs)\n        self.validate_arguments()\n\n    cls.__init__ = new_init\n    return cls\n\n\nAUTHORIZED_TYPES = [\n    \"string\",\n    \"boolean\",\n    \"integer\",\n    \"number\",\n    \"image\",\n    \"audio\",\n    \"array\",\n    \"object\",\n    \"any\",\n    \"null\",\n]\n\nCONVERSION_DICT = {\"str\": \"string\", \"int\": \"integer\", \"float\": \"number\"}\n\n\nclass BaseTool(ABC):\n    name: str\n\n    @abstractmethod\n    def __call__(self, *args, **kwargs) -> Any:\n        pass\n\n\nclass Tool(BaseTool):\n    \"\"\"\n    A base class for the functions used by the agent. Subclass this and implement the `forward` method as well as the\n    following class attributes:\n\n    - **description** (`str`) -- A short description of what your tool does, the inputs it expects and the output(s) it\n      will return. For instance 'This is a tool that downloads a file from a `url`. It takes the `url` as input, and\n      returns the text contained in the file'.\n    - **name** (`str`) -- A performative name that will be used for your tool in the prompt to the agent. It must be a\n      valid Python identifier, for instance `\"text_classifier\"` or `\"image_generator\"`.\n    - **inputs** (`Dict[str, Dict[str, Union[str, type, bool]]]`) -- The dict of modalities expected for the inputs.\n      Each input has a `type` key and a `description` key.\n      This is used by `launch_gradio_demo` or to make a nice space from your tool, and also can be used in the generated\n      description for your tool.\n    - **output_type** (`str`) -- The type of the tool output. This is used by `launch_gradio_demo`\n      or to make a nice space from your tool, and also can be used in the generated description for your tool.\n    - **output_schema** (`Dict[str, Any]`, *optional*) -- The JSON schema defining the expected structure of the tool output.\n      This can be included in system prompts to help agents understand the expected output format. 
Note: This is currently\n      used for informational purposes only and does not perform actual output validation.\n\n    You can also override the method [`~Tool.setup`] if your tool has an expensive operation to perform before being\n    usable (such as loading a model). [`~Tool.setup`] will be called the first time you use your tool, but not at\n    instantiation.\n    \"\"\"\n\n    name: str\n    description: str\n    inputs: dict[str, dict[str, str | type | bool]]\n    output_type: str\n    output_schema: dict[str, Any] | None = None\n\n    def __init__(self, *args, **kwargs):\n        self.is_initialized = False\n\n    def __init_subclass__(cls, **kwargs):\n        super().__init_subclass__(**kwargs)\n        validate_after_init(cls)\n\n    def validate_arguments(self):\n        required_attributes = {\n            \"description\": str,\n            \"name\": str,\n            \"inputs\": dict,\n            \"output_type\": str,\n        }\n        # Validate class attributes\n        for attr, expected_type in required_attributes.items():\n            attr_value = getattr(self, attr, None)\n            if attr_value is None:\n                raise TypeError(f\"You must set an attribute {attr}.\")\n            if not isinstance(attr_value, expected_type):\n                raise TypeError(\n                    f\"Attribute {attr} should have type {expected_type.__name__}, got {type(attr_value)} instead.\"\n                )\n\n        # Validate optional output_schema attribute\n        output_schema = getattr(self, \"output_schema\", None)\n        if output_schema is not None and not isinstance(output_schema, dict):\n            raise TypeError(f\"Attribute output_schema should have type dict, got {type(output_schema)} instead.\")\n\n        # - Validate name\n        if not is_valid_name(self.name):\n            raise Exception(\n                f\"Invalid Tool name '{self.name}': must be a valid Python identifier and not a reserved keyword\"\n            
)\n        # Validate inputs\n        for input_name, input_content in self.inputs.items():\n            assert isinstance(input_content, dict), f\"Input '{input_name}' should be a dictionary.\"\n            assert \"type\" in input_content and \"description\" in input_content, (\n                f\"Input '{input_name}' should have keys 'type' and 'description', has only {list(input_content.keys())}.\"\n            )\n            # Get input_types as a list, whether from a string or list\n            if isinstance(input_content[\"type\"], str):\n                input_types = [input_content[\"type\"]]\n            elif isinstance(input_content[\"type\"], list):\n                input_types = input_content[\"type\"]\n                # Check if all elements are strings\n                if not all(isinstance(t, str) for t in input_types):\n                    raise TypeError(\n                        f\"Input '{input_name}': when type is a list, all elements must be strings, got {input_content['type']}\"\n                    )\n            else:\n                raise TypeError(\n                    f\"Input '{input_name}': type must be a string or list of strings, got {type(input_content['type']).__name__}\"\n                )\n            # Check all types are authorized\n            invalid_types = [t for t in input_types if t not in AUTHORIZED_TYPES]\n            if invalid_types:\n                raise ValueError(f\"Input '{input_name}': types {invalid_types} must be one of {AUTHORIZED_TYPES}\")\n        # Validate output type\n        assert getattr(self, \"output_type\", None) in AUTHORIZED_TYPES\n\n        # Validate forward function signature, except for Tools that use a \"generic\" signature (PipelineTool, SpaceToolWrapper, LangChainToolWrapper)\n        if not (\n            hasattr(self, \"skip_forward_signature_validation\")\n            and getattr(self, \"skip_forward_signature_validation\") is True\n        ):\n            signature = 
inspect.signature(self.forward)\n            actual_keys = set(key for key in signature.parameters.keys() if key != \"self\")\n            expected_keys = set(self.inputs.keys())\n            if actual_keys != expected_keys:\n                raise Exception(\n                    f\"In tool '{self.name}', 'forward' method parameters were {actual_keys}, but expected {expected_keys}. \"\n                    f\"It should take 'self' as its first argument, then its next arguments should match the keys of tool attribute 'inputs'.\"\n                )\n\n            json_schema = _convert_type_hints_to_json_schema(self.forward, error_on_missing_type_hints=False)[\n                \"properties\"\n            ]  # This function will not raise an error on missing docstrings, contrary to get_json_schema\n            for key, value in self.inputs.items():\n                assert key in json_schema, (\n                    f\"Input '{key}' should be present in function signature, found only {json_schema.keys()}\"\n                )\n                if \"nullable\" in value:\n                    assert \"nullable\" in json_schema[key], (\n                        f\"Nullable argument '{key}' in inputs should have key 'nullable' set to True in function signature.\"\n                    )\n                if key in json_schema and \"nullable\" in json_schema[key]:\n                    assert \"nullable\" in value, (\n                        f\"Nullable argument '{key}' in function signature should have key 'nullable' set to True in inputs.\"\n                    )\n\n    def forward(self, *args, **kwargs):\n        raise NotImplementedError(\"Write this method in your subclass of `Tool`.\")\n\n    def __call__(self, *args, sanitize_inputs_outputs: bool = False, **kwargs):\n        if not self.is_initialized:\n            self.setup()\n\n        # Handle the arguments might be passed as a single dictionary\n        if len(args) == 1 and len(kwargs) == 0 and isinstance(args[0], 
dict):\n            potential_kwargs = args[0]\n\n            # If the dictionary keys match our input parameters, convert it to kwargs\n            if all(key in self.inputs for key in potential_kwargs):\n                args = ()\n                kwargs = potential_kwargs\n\n        if sanitize_inputs_outputs:\n            args, kwargs = handle_agent_input_types(*args, **kwargs)\n        outputs = self.forward(*args, **kwargs)\n        if sanitize_inputs_outputs:\n            outputs = handle_agent_output_types(outputs, self.output_type)\n        return outputs\n\n    def setup(self):\n        \"\"\"\n        Overwrite this method here for any operation that is expensive and needs to be executed before you start using\n        your tool. Such as loading a big model.\n        \"\"\"\n        self.is_initialized = True\n\n    def to_code_prompt(self) -> str:\n        args_signature = \", \".join(f\"{arg_name}: {arg_schema['type']}\" for arg_name, arg_schema in self.inputs.items())\n\n        # Use dict type for tools with output schema to indicate structured return\n        has_schema = hasattr(self, \"output_schema\") and self.output_schema is not None\n        output_type = \"dict\" if has_schema else self.output_type\n        tool_signature = f\"({args_signature}) -> {output_type}\"\n        tool_doc = self.description\n\n        # Add an important note for smaller models (e.g. Mistral Small, Gemma 3, etc.) to properly handle structured output.\n        if has_schema:\n            tool_doc += \"\\n\\nImportant: This tool returns structured output! Use the JSON schema below to directly access fields like result['field_name']. 
NO print() statements needed to inspect the output!\"\n\n        # Add arguments documentation\n        if self.inputs:\n            args_descriptions = \"\\n\".join(\n                f\"{arg_name}: {arg_schema['description']}\" for arg_name, arg_schema in self.inputs.items()\n            )\n            args_doc = f\"Args:\\n{textwrap.indent(args_descriptions, '    ')}\"\n            tool_doc += f\"\\n\\n{args_doc}\"\n\n        # Add returns documentation with output schema if it exists\n        if has_schema:\n            formatted_schema = json.dumps(self.output_schema, indent=4)\n            indented_schema = textwrap.indent(formatted_schema, \"        \")\n            returns_doc = f\"\\nReturns:\\n    dict (structured output): This tool ALWAYS returns a dictionary that strictly adheres to the following JSON schema:\\n{indented_schema}\"\n            tool_doc += f\"\\n{returns_doc}\"\n\n        tool_doc = f'\"\"\"{tool_doc}\\n\"\"\"'\n        return f\"def {self.name}{tool_signature}:\\n{textwrap.indent(tool_doc, '    ')}\"\n\n    def to_tool_calling_prompt(self) -> str:\n        return f\"{self.name}: {self.description}\\n    Takes inputs: {self.inputs}\\n    Returns an output of type: {self.output_type}\"\n\n    def to_dict(self) -> dict:\n        \"\"\"Returns a dictionary representing the tool\"\"\"\n        class_name = self.__class__.__name__\n        if type(self).__name__ == \"SimpleTool\":\n            # Check that imports are self-contained\n            source_code = get_source(self.forward).replace(\"@tool\", \"\")\n            forward_node = ast.parse(source_code)\n            # If tool was created using '@tool' decorator, it has only a forward pass, so it's simpler to just get its code\n            method_checker = MethodChecker(set())\n            method_checker.visit(forward_node)\n\n            if len(method_checker.errors) > 0:\n                errors = [f\"- {error}\" for error in method_checker.errors]\n                raise 
(ValueError(f\"SimpleTool validation failed for {self.name}:\\n\" + \"\\n\".join(errors)))\n\n            forward_source_code = get_source(self.forward)\n            tool_code = textwrap.dedent(\n                f\"\"\"\n            from smolagents import Tool\n            from typing import Any, Optional\n\n            class {class_name}(Tool):\n                name = \"{self.name}\"\n                description = {json.dumps(textwrap.dedent(self.description).strip())}\n                inputs = {repr(self.inputs)}\n                output_type = \"{self.output_type}\"\n            \"\"\"\n            ).strip()\n\n            # Add output_schema if it exists\n            if hasattr(self, \"output_schema\") and self.output_schema is not None:\n                tool_code += f\"\\n                output_schema = {repr(self.output_schema)}\"\n            import re\n\n            def add_self_argument(source_code: str) -> str:\n                \"\"\"Add 'self' as first argument to a function definition if not present.\"\"\"\n                pattern = r\"def forward\\(((?!self)[^)]*)\\)\"\n\n                def replacement(match):\n                    args = match.group(1).strip()\n                    if args:  # If there are other arguments\n                        return f\"def forward(self, {args})\"\n                    return \"def forward(self)\"\n\n                return re.sub(pattern, replacement, source_code)\n\n            forward_source_code = forward_source_code.replace(self.name, \"forward\")\n            forward_source_code = add_self_argument(forward_source_code)\n            forward_source_code = forward_source_code.replace(\"@tool\", \"\").strip()\n            tool_code += \"\\n\\n\" + textwrap.indent(forward_source_code, \"    \")\n\n        else:  # If the tool was not created by the @tool decorator, it was made by subclassing Tool\n            if type(self).__name__ in [\n                \"SpaceToolWrapper\",\n                
\"LangChainToolWrapper\",\n                \"GradioToolWrapper\",\n            ]:\n                raise ValueError(\n                    \"Cannot save objects created with from_space, from_langchain or from_gradio, as this would create errors.\"\n                )\n\n            validate_tool_attributes(self.__class__)\n\n            tool_code = \"from typing import Any, Optional\\n\" + instance_to_source(self, base_cls=Tool)\n\n        requirements = {el for el in get_imports(tool_code) if el not in sys.stdlib_module_names} | {\"smolagents\"}\n\n        tool_dict = {\"name\": self.name, \"code\": tool_code, \"requirements\": sorted(requirements)}\n\n        # Add output_schema if it exists\n        if hasattr(self, \"output_schema\") and self.output_schema is not None:\n            tool_dict[\"output_schema\"] = self.output_schema\n\n        return tool_dict\n\n    @classmethod\n    def from_dict(cls, tool_dict: dict[str, Any], **kwargs) -> \"Tool\":\n        \"\"\"\n        Create tool from a dictionary representation.\n\n        Args:\n            tool_dict (`dict[str, Any]`): Dictionary representation of the tool.\n            **kwargs: Additional keyword arguments to pass to the tool's constructor.\n\n        Returns:\n            `Tool`: Tool object.\n        \"\"\"\n        if \"code\" not in tool_dict:\n            raise ValueError(\"Tool dictionary must contain 'code' key with the tool source code\")\n\n        tool = cls.from_code(tool_dict[\"code\"], **kwargs)\n\n        # Set output_schema if it exists in the dictionary\n        if \"output_schema\" in tool_dict:\n            tool.output_schema = tool_dict[\"output_schema\"]\n\n        return tool\n\n    def save(self, output_dir: str | Path, tool_file_name: str = \"tool\", make_gradio_app: bool = True):\n        \"\"\"\n        Saves the relevant code files for your tool so it can be pushed to the Hub. 
This will copy the code of your\n        tool in `output_dir` as well as autogenerate:\n\n        - a `{tool_file_name}.py` file containing the logic for your tool.\n        If you pass `make_gradio_app=True`, this will also write:\n        - an `app.py` file providing a UI for your tool when it is exported to a Space with `tool.push_to_hub()`\n        - a `requirements.txt` containing the names of the modules used by your tool (as detected when inspecting its\n          code)\n\n        Args:\n            output_dir (`str` or `Path`): The folder in which you want to save your tool.\n            tool_file_name (`str`, *optional*): The file name in which you want to save your tool.\n            make_gradio_app (`bool`, *optional*, defaults to True): Whether to also export a `requirements.txt` file and Gradio UI.\n        \"\"\"\n        # Ensure output directory exists\n        output_path = Path(output_dir)\n        output_path.mkdir(parents=True, exist_ok=True)\n        # Save tool file\n        self._write_file(output_path / f\"{tool_file_name}.py\", self._get_tool_code())\n        if make_gradio_app:\n            #  Save app file\n            self._write_file(output_path / \"app.py\", self._get_gradio_app_code(tool_module_name=tool_file_name))\n            # Save requirements file\n            self._write_file(output_path / \"requirements.txt\", self._get_requirements())\n\n    def _write_file(self, file_path: Path, content: str) -> None:\n        \"\"\"Writes content to a file with UTF-8 encoding.\"\"\"\n        file_path.write_text(content, encoding=\"utf-8\")\n\n    def push_to_hub(\n        self,\n        repo_id: str,\n        commit_message: str = \"Upload tool\",\n        private: bool | None = None,\n        token: bool | str | None = None,\n        create_pr: bool = False,\n    ) -> str:\n        \"\"\"\n        Upload the tool to the Hub.\n\n        Parameters:\n            repo_id (`str`):\n                The name of the repository you want to push 
your tool to. It should contain your organization name when\n                pushing to a given organization.\n            commit_message (`str`, *optional*, defaults to `\"Upload tool\"`):\n                Message to commit while pushing.\n            private (`bool`, *optional*):\n                Whether to make the repo private. If `None` (default), the repo will be public unless the organization's default is private. This value is ignored if the repo already exists.\n            token (`bool` or `str`, *optional*):\n                The token to use as HTTP bearer authorization for remote files. If unset, will use the token generated\n                when running `huggingface-cli login` (stored in `~/.huggingface`).\n            create_pr (`bool`, *optional*, defaults to `False`):\n                Whether to create a PR with the uploaded files or directly commit.\n        \"\"\"\n        # Initialize repository\n        repo_id = self._initialize_hub_repo(repo_id, token, private)\n        # Prepare files for commit\n        additions = self._prepare_hub_files()\n        # Create commit\n        return create_commit(\n            repo_id=repo_id,\n            operations=additions,\n            commit_message=commit_message,\n            token=token,\n            create_pr=create_pr,\n            repo_type=\"space\",\n        )\n\n    @staticmethod\n    def _initialize_hub_repo(repo_id: str, token: bool | str | None, private: bool | None) -> str:\n        \"\"\"Initialize repository on Hugging Face Hub.\"\"\"\n        repo_url = create_repo(\n            repo_id=repo_id,\n            token=token,\n            private=private,\n            exist_ok=True,\n            repo_type=\"space\",\n            space_sdk=\"gradio\",\n        )\n        metadata_update(repo_url.repo_id, {\"tags\": [\"smolagents\", \"tool\"]}, repo_type=\"space\", token=token)\n        return repo_url.repo_id\n\n    def _prepare_hub_files(self) -> list:\n        \"\"\"Prepare files for Hub 
commit.\"\"\"\n        additions = [\n            # Add tool code\n            CommitOperationAdd(\n                path_in_repo=\"tool.py\",\n                path_or_fileobj=self._get_tool_code().encode(),\n            ),\n            # Add Gradio app\n            CommitOperationAdd(\n                path_in_repo=\"app.py\",\n                path_or_fileobj=self._get_gradio_app_code().encode(),\n            ),\n            # Add requirements\n            CommitOperationAdd(\n                path_in_repo=\"requirements.txt\",\n                path_or_fileobj=self._get_requirements().encode(),\n            ),\n        ]\n        return additions\n\n    def _get_tool_code(self) -> str:\n        \"\"\"Get the tool's code.\"\"\"\n        return self.to_dict()[\"code\"]\n\n    def _get_gradio_app_code(self, tool_module_name: str = \"tool\") -> str:\n        \"\"\"Get the Gradio app code.\"\"\"\n        class_name = self.__class__.__name__\n        return textwrap.dedent(\n            f\"\"\"\\\n            from smolagents import launch_gradio_demo\n            from {tool_module_name} import {class_name}\n\n            tool = {class_name}()\n            launch_gradio_demo(tool)\n            \"\"\"\n        )\n\n    def _get_requirements(self) -> str:\n        \"\"\"Get the requirements.\"\"\"\n        return \"\\n\".join(self.to_dict()[\"requirements\"])\n\n    @classmethod\n    def from_hub(\n        cls,\n        repo_id: str,\n        token: str | None = None,\n        trust_remote_code: bool = False,\n        **kwargs,\n    ):\n        \"\"\"\n        Loads a tool defined on the Hub.\n\n        <Tip warning={true}>\n\n        Loading a tool from the Hub means that you'll download the tool and execute it locally.\n        ALWAYS inspect the tool you're downloading before loading it within your runtime, as you would do when\n        installing a package using pip/npm/apt.\n\n        </Tip>\n\n        Args:\n            repo_id (`str`):\n                The name of the 
Space repo on the Hub where your tool is defined.\n            token (`str`, *optional*):\n                The token to identify you on hf.co. If unset, will use the token generated when running\n                `huggingface-cli login` (stored in `~/.huggingface`).\n            trust_remote_code (`bool`, *optional*, defaults to `False`):\n                This flag marks that you understand the risk of running remote code and that you trust this tool.\n                If this is not set to `True`, loading the tool from the Hub will fail.\n            kwargs (additional keyword arguments, *optional*):\n                Additional keyword arguments that will be split in two: all arguments relevant to the Hub (such as\n                `cache_dir`, `revision`, `subfolder`) will be used when downloading the files for your tool, and the\n                others will be passed along to its init.\n        \"\"\"\n        if not trust_remote_code:\n            raise ValueError(\n                \"Loading a tool from Hub requires to acknowledge you trust its code: to do so, pass `trust_remote_code=True`.\"\n            )\n\n        # Get the tool's tool.py file.\n        tool_file = hf_hub_download(\n            repo_id,\n            \"tool.py\",\n            token=token,\n            repo_type=\"space\",\n            cache_dir=kwargs.get(\"cache_dir\"),\n            force_download=kwargs.get(\"force_download\"),\n            proxies=kwargs.get(\"proxies\"),\n            revision=kwargs.get(\"revision\"),\n            subfolder=kwargs.get(\"subfolder\"),\n            local_files_only=kwargs.get(\"local_files_only\"),\n        )\n\n        # Tool files are written as UTF-8 (see `_write_file`), so read them back with the same encoding\n        tool_code = Path(tool_file).read_text(encoding=\"utf-8\")\n        return Tool.from_code(tool_code, **kwargs)\n\n    @classmethod\n    def from_code(cls, tool_code: str, **kwargs):\n        module = types.ModuleType(\"dynamic_tool\")\n\n        exec(tool_code, module.__dict__)\n\n        # Find the Tool subclass\n        tool_class = next(\n            (\n              
  obj\n                for _, obj in inspect.getmembers(module, inspect.isclass)\n                if issubclass(obj, Tool) and obj is not Tool\n            ),\n            None,\n        )\n\n        if tool_class is None:\n            raise ValueError(\"No Tool subclass found in the code.\")\n\n        if not isinstance(tool_class.inputs, dict):\n            tool_class.inputs = ast.literal_eval(tool_class.inputs)\n\n        # Handle output_schema if it exists and is a string representation\n        if hasattr(tool_class, \"output_schema\") and isinstance(tool_class.output_schema, str):\n            tool_class.output_schema = ast.literal_eval(tool_class.output_schema)\n\n        return tool_class(**kwargs)\n\n    @staticmethod\n    def from_space(\n        space_id: str,\n        name: str,\n        description: str = \"\",\n        api_name: str | None = None,\n        token: str | None = None,\n    ):\n        \"\"\"\n        Creates a [`Tool`] from a Space given its id on the Hub.\n\n        Args:\n            space_id (`str`):\n                The id of the Space on the Hub.\n            name (`str`):\n                The name of the tool.\n            description (`str`):\n                The description of the tool.\n            api_name (`str`, *optional*):\n                The specific api_name to use, if the Space has several tabs. If not specified, defaults to the first available API.\n            token (`str`, *optional*):\n                Add your token to access private Spaces or increase your GPU quotas.\n        Returns:\n            [`Tool`]:\n                The Space, as a tool.\n\n        Examples:\n        ```py\n        >>> image_generator = Tool.from_space(\n        ...     space_id=\"black-forest-labs/FLUX.1-schnell\",\n        ...     name=\"image-generator\",\n        ...     description=\"Generate an image from a prompt\"\n        ... 
)\n        >>> image = image_generator(\"Generate an image of a cool surfer in Tahiti\")\n        ```\n        ```py\n        >>> face_swapper = Tool.from_space(\n        ...     \"tuan2308/face-swap\",\n        ...     \"face_swapper\",\n        ...     \"Tool that puts the face shown on the first image on the second image. You can give it paths to images.\",\n        ... )\n        >>> image = face_swapper('./aymeric.jpeg', './ruth.jpg')\n        ```\n        \"\"\"\n        from gradio_client import Client, handle_file\n\n        class SpaceToolWrapper(Tool):\n            skip_forward_signature_validation = True\n\n            def __init__(\n                self,\n                space_id: str,\n                name: str,\n                description: str = \"\",\n                api_name: str | None = None,\n                token: str | None = None,\n            ):\n                self.name = name\n                self.description = description\n                self.client = Client(space_id, hf_token=token)\n                space_api = self.client.view_api(return_format=\"dict\", print_info=False)\n                assert isinstance(space_api, dict)\n                space_description = space_api[\"named_endpoints\"]\n\n                # If api_name is not defined, take the first of the available APIs for this space\n                if api_name is None:\n                    api_name = list(space_description.keys())[0]\n                    warnings.warn(\n                        f\"Since `api_name` was not defined, it was automatically set to the first available API: `{api_name}`.\"\n                    )\n                self.api_name = api_name\n\n                try:\n                    space_description_api = space_description[api_name]\n                except KeyError:\n                    raise KeyError(f\"Could not find specified {api_name=} among available api names.\")\n                self.inputs = {}\n                for parameter in 
space_description_api[\"parameters\"]:\n                    parameter_type = parameter[\"type\"][\"type\"]\n                    if parameter_type == \"object\":\n                        parameter_type = \"any\"\n                    self.inputs[parameter[\"parameter_name\"]] = {\n                        \"type\": parameter_type,\n                        \"description\": parameter[\"python_type\"][\"description\"],\n                        \"nullable\": parameter[\"parameter_has_default\"],\n                    }\n                output_component = space_description_api[\"returns\"][0][\"component\"]\n                if output_component == \"Image\":\n                    self.output_type = \"image\"\n                elif output_component == \"Audio\":\n                    self.output_type = \"audio\"\n                else:\n                    self.output_type = \"any\"\n                self.is_initialized = True\n\n            def sanitize_argument_for_prediction(self, arg):\n                from gradio_client.utils import is_http_url_like\n                from PIL.Image import Image\n\n                if isinstance(arg, Image):\n                    temp_file = tempfile.NamedTemporaryFile(suffix=\".png\", delete=False)\n                    arg.save(temp_file.name)\n                    arg = temp_file.name\n                if (\n                    (isinstance(arg, str) and os.path.isfile(arg))\n                    or (isinstance(arg, Path) and arg.exists() and arg.is_file())\n                    or is_http_url_like(arg)\n                ):\n                    arg = handle_file(arg)\n                return arg\n\n            def forward(self, *args, **kwargs):\n                # Preprocess args and kwargs:\n                args = list(args)\n                for i, arg in enumerate(args):\n                    args[i] = self.sanitize_argument_for_prediction(arg)\n                for arg_name, arg in kwargs.items():\n                    kwargs[arg_name] = 
self.sanitize_argument_for_prediction(arg)\n\n                output = self.client.predict(*args, api_name=self.api_name, **kwargs)\n                if isinstance(output, tuple) or isinstance(output, list):\n                    # Only inspect index 1 when it exists, so single-element outputs do not raise IndexError\n                    if len(output) > 1 and isinstance(output[1], str):\n                        raise ValueError(\"The space returned this message: \" + output[1])\n                    output = output[\n                        0\n                    ]  # Sometimes the space also returns the generation seed, in which case the result is at index 0\n                IMAGE_EXTENSIONS = [\".png\", \".jpg\", \".jpeg\", \".gif\", \".webp\"]\n                AUDIO_EXTENSIONS = [\".mp3\", \".wav\", \".ogg\", \".m4a\", \".flac\"]\n                if isinstance(output, str) and any([output.endswith(ext) for ext in IMAGE_EXTENSIONS]):\n                    output = AgentImage(output)\n                elif isinstance(output, str) and any([output.endswith(ext) for ext in AUDIO_EXTENSIONS]):\n                    output = AgentAudio(output)\n                return output\n\n        return SpaceToolWrapper(\n            space_id=space_id,\n            name=name,\n            description=description,\n            api_name=api_name,\n            token=token,\n        )\n\n    @staticmethod\n    def from_gradio(gradio_tool):\n        \"\"\"\n        Creates a [`Tool`] from a gradio tool.\n        \"\"\"\n        import inspect\n\n        class GradioToolWrapper(Tool):\n            def __init__(self, _gradio_tool):\n                self.name = _gradio_tool.name\n                self.description = _gradio_tool.description\n                self.output_type = \"string\"\n                self._gradio_tool = _gradio_tool\n                func_args = list(inspect.signature(_gradio_tool.run).parameters.items())\n                self.inputs = {\n                    key: {\"type\": CONVERSION_DICT[value.annotation], \"description\": \"\"} for key, value in func_args\n                }\n                
self.forward = self._gradio_tool.run\n\n        return GradioToolWrapper(gradio_tool)\n\n    @staticmethod\n    def from_langchain(langchain_tool):\n        \"\"\"\n        Creates a [`Tool`] from a langchain tool.\n        \"\"\"\n\n        class LangChainToolWrapper(Tool):\n            skip_forward_signature_validation = True\n\n            def __init__(self, _langchain_tool):\n                self.name = _langchain_tool.name.lower()\n                self.description = _langchain_tool.description\n                self.inputs = _langchain_tool.args.copy()\n                for input_content in self.inputs.values():\n                    if \"title\" in input_content:\n                        input_content.pop(\"title\")\n                    input_content[\"description\"] = \"\"\n                self.output_type = \"string\"\n                self.langchain_tool = _langchain_tool\n                self.is_initialized = True\n\n            def forward(self, *args, **kwargs):\n                tool_input = kwargs.copy()\n                # Map each positional argument to the input declared at the same position\n                input_keys = list(self.inputs)\n                for index, argument in enumerate(args):\n                    if index < len(input_keys):\n                        input_key = input_keys[index]\n                        tool_input[input_key] = argument\n                return self.langchain_tool.run(tool_input)\n\n        return LangChainToolWrapper(langchain_tool)\n\n\ndef launch_gradio_demo(tool: Tool):\n    \"\"\"\n    Launches a gradio demo for a tool. 
The corresponding tool class needs to properly implement the class attributes\n    `inputs` and `output_type`.\n\n    Args:\n        tool (`Tool`): The tool for which to launch the demo.\n    \"\"\"\n    try:\n        import gradio as gr\n    except ImportError:\n        raise ImportError(\"Gradio should be installed in order to launch a gradio demo.\")\n\n    TYPE_TO_COMPONENT_CLASS_MAPPING = {\n        \"boolean\": gr.Checkbox,\n        \"image\": gr.Image,\n        \"audio\": gr.Audio,\n        \"string\": gr.Textbox,\n        \"integer\": gr.Number,\n        \"number\": gr.Number,\n    }\n\n    def tool_forward(*args, **kwargs):\n        return tool(*args, sanitize_inputs_outputs=True, **kwargs)\n\n    tool_forward.__signature__ = inspect.signature(tool.forward)\n\n    gradio_inputs = []\n    for input_name, input_details in tool.inputs.items():\n        input_gradio_component_class = TYPE_TO_COMPONENT_CLASS_MAPPING[input_details[\"type\"]]\n        new_component = input_gradio_component_class(label=input_name)\n        gradio_inputs.append(new_component)\n\n    output_gradio_component_class = TYPE_TO_COMPONENT_CLASS_MAPPING[tool.output_type]\n    gradio_output = output_gradio_component_class(label=\"Output\")\n\n    gr.Interface(\n        fn=tool_forward,\n        inputs=gradio_inputs,\n        outputs=gradio_output,\n        title=tool.name,\n        description=tool.description,\n        api_name=tool.name,\n    ).launch()\n\n\ndef load_tool(\n    repo_id,\n    model_repo_id: str | None = None,\n    token: str | None = None,\n    trust_remote_code: bool = False,\n    **kwargs,\n):\n    \"\"\"\n    Main function to quickly load a tool from the Hub.\n\n    <Tip warning={true}>\n\n    Loading a tool means that you'll download the tool and execute it locally.\n    ALWAYS inspect the tool you're downloading before loading it within your runtime, as you would do when\n    installing a package using pip/npm/apt.\n\n    </Tip>\n\n    Args:\n        repo_id 
(`str`):\n            Space repo ID of a tool on the Hub.\n        model_repo_id (`str`, *optional*):\n            Use this argument to use a different model than the default one for the tool you selected.\n        token (`str`, *optional*):\n            The token to identify you on hf.co. If unset, will use the token generated when running `huggingface-cli\n            login` (stored in `~/.huggingface`).\n        trust_remote_code (`bool`, *optional*, defaults to False):\n            This needs to be accepted in order to load a tool from Hub.\n        kwargs (additional keyword arguments, *optional*):\n            Additional keyword arguments that will be split in two: all arguments relevant to the Hub (such as\n            `cache_dir`, `revision`, `subfolder`) will be used when downloading the files for your tool, and the others\n            will be passed along to its init.\n    \"\"\"\n    return Tool.from_hub(\n        repo_id,\n        model_repo_id=model_repo_id,\n        token=token,\n        trust_remote_code=trust_remote_code,\n        **kwargs,\n    )\n\n\ndef add_description(description):\n    \"\"\"\n    A decorator that adds a description to a function.\n    \"\"\"\n\n    def inner(func):\n        func.description = description\n        func.name = func.__name__\n        return func\n\n    return inner\n\n\nclass ToolCollection:\n    \"\"\"\n    Tool collections enable loading a collection of tools in the agent's toolbox.\n\n    Collections can be loaded from a collection in the Hub or from an MCP server, see:\n    - [`ToolCollection.from_hub`]\n    - [`ToolCollection.from_mcp`]\n\n    For example and usage, see: [`ToolCollection.from_hub`] and [`ToolCollection.from_mcp`]\n    \"\"\"\n\n    def __init__(self, tools: list[Tool]):\n        self.tools = tools\n\n    @classmethod\n    def from_hub(\n        cls,\n        collection_slug: str,\n        token: str | None = None,\n        trust_remote_code: bool = False,\n    ) -> \"ToolCollection\":\n      
  \"\"\"Loads a tool collection from the Hub.\n\n        It adds the tools from all Spaces in the collection to the agent's toolbox.\n\n        > [!NOTE]\n        > Only Spaces will be fetched, so you can feel free to add models and datasets to your collection if you'd\n        > like for this collection to showcase them.\n\n        Args:\n            collection_slug (str): The collection slug referencing the collection.\n            token (str, *optional*): The authentication token if the collection is private.\n            trust_remote_code (bool, *optional*, defaults to False): Whether to trust the remote code.\n                Loading fails unless this is set to True, because each tool is loaded with `Tool.from_hub`.\n\n        Returns:\n            ToolCollection: A tool collection instance loaded with the tools.\n\n        Example:\n        ```py\n        >>> from smolagents import ToolCollection, CodeAgent\n\n        >>> image_tool_collection = ToolCollection.from_hub(\"huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f\", trust_remote_code=True)\n        >>> agent = CodeAgent(tools=[*image_tool_collection.tools], add_base_tools=True)\n\n        >>> agent.run(\"Please draw me a picture of rivers and lakes.\")\n        ```\n        \"\"\"\n        _collection = get_collection(collection_slug, token=token)\n        _hub_repo_ids = {item.item_id for item in _collection.items if item.item_type == \"space\"}\n\n        tools = [Tool.from_hub(repo_id, token, trust_remote_code) for repo_id in _hub_repo_ids]\n\n        return cls(tools)\n\n    @classmethod\n    @contextmanager\n    def from_mcp(\n        cls,\n        server_parameters: \"mcp.StdioServerParameters\" | dict,\n        trust_remote_code: bool = False,\n        structured_output: bool | None = None,\n    ) -> \"ToolCollection\":\n        \"\"\"Automatically load a tool collection from an MCP server.\n\n        This method supports Stdio, Streamable HTTP, and legacy HTTP+SSE MCP servers. 
Look at the `server_parameters`\n        argument for more details on how to connect to each MCP server.\n\n        Note: a separate thread will be spawned to run an asyncio event loop handling\n        the MCP server.\n\n        Args:\n            server_parameters (`mcp.StdioServerParameters` or `dict`):\n                Configuration parameters to connect to the MCP server. This can be:\n\n                - An instance of `mcp.StdioServerParameters` for connecting a Stdio MCP server via standard input/output using a subprocess.\n\n                - A `dict` with at least:\n                  - \"url\": URL of the server.\n                  - \"transport\": Transport protocol to use, one of:\n                    - \"streamable-http\": Streamable HTTP transport (default).\n                    - \"sse\": Legacy HTTP+SSE transport (deprecated).\n            trust_remote_code (`bool`, *optional*, defaults to `False`):\n                Whether to trust the execution of code from tools defined on the MCP server.\n                This option should only be set to `True` if you trust the MCP server,\n                and understand the risks associated with running remote code on your local machine.\n                If set to `False`, loading tools from MCP will fail.\n            structured_output (`bool`, *optional*, defaults to `False`):\n                Whether to enable structured output features for MCP tools. 
If True, enables:\n                - Support for outputSchema in MCP tools\n                - Structured content handling (structuredContent from MCP responses)\n                - JSON parsing fallback for structured data\n                If False, uses the original simple text-only behavior for backwards compatibility.\n\n        Returns:\n            ToolCollection: A tool collection instance.\n\n        Example with a Stdio MCP server:\n        ```py\n        >>> import os\n        >>> from smolagents import ToolCollection, CodeAgent, InferenceClientModel\n        >>> from mcp import StdioServerParameters\n\n        >>> model = InferenceClientModel()\n\n        >>> server_parameters = StdioServerParameters(\n        >>>     command=\"uvx\",\n        >>>     args=[\"--quiet\", \"pubmedmcp@0.1.3\"],\n        >>>     env={\"UV_PYTHON\": \"3.12\", **os.environ},\n        >>> )\n\n        >>> with ToolCollection.from_mcp(server_parameters, trust_remote_code=True) as tool_collection:\n        >>>     agent = CodeAgent(tools=[*tool_collection.tools], add_base_tools=True, model=model)\n        >>>     agent.run(\"Please find a remedy for hangover.\")\n        ```\n\n        Example with structured output enabled:\n        ```py\n        >>> with ToolCollection.from_mcp(server_parameters, trust_remote_code=True, structured_output=True) as tool_collection:\n        >>>     agent = CodeAgent(tools=[*tool_collection.tools], add_base_tools=True, model=model)\n        >>>     agent.run(\"Please find a remedy for hangover.\")\n        ```\n\n        Example with a Streamable HTTP MCP server:\n        ```py\n        >>> with ToolCollection.from_mcp({\"url\": \"http://127.0.0.1:8000/mcp\", \"transport\": \"streamable-http\"}, trust_remote_code=True) as tool_collection:\n        >>>     agent = CodeAgent(tools=[*tool_collection.tools], add_base_tools=True, model=model)\n        >>>     agent.run(\"Please find a remedy for hangover.\")\n        ```\n        \"\"\"\n        # 
Handle future warning for structured_output default value change\n        if structured_output is None:\n            warnings.warn(\n                \"Parameter 'structured_output' was not specified. \"\n                \"Currently it defaults to False, but in version 1.25, the default will change to True. \"\n                \"To suppress this warning, explicitly set structured_output=True (new behavior) or structured_output=False (legacy behavior). \"\n                \"See documentation at https://huggingface.co/docs/smolagents/tutorials/tools#structured-output-and-output-schema-support for more details.\",\n                FutureWarning,\n                stacklevel=2,\n            )\n            structured_output = False\n\n        try:\n            from mcpadapt.core import MCPAdapt\n            from mcpadapt.smolagents_adapter import SmolAgentsAdapter\n        except ImportError:\n            raise ImportError(\n                \"\"\"Please install 'mcp' extra to use ToolCollection.from_mcp: `pip install 'smolagents[mcp]'`.\"\"\"\n            )\n        if isinstance(server_parameters, dict):\n            transport = server_parameters.get(\"transport\")\n            if transport is None:\n                transport = \"streamable-http\"\n                server_parameters[\"transport\"] = transport\n            if transport not in {\"sse\", \"streamable-http\"}:\n                raise ValueError(\n                    f\"Unsupported transport: {transport}. 
Supported transports are 'streamable-http' and 'sse'.\"\n                )\n        if not trust_remote_code:\n            raise ValueError(\n                \"Loading tools from MCP requires you to acknowledge you trust the MCP server, \"\n                \"as it will execute code on your local machine: pass `trust_remote_code=True`.\"\n            )\n        with MCPAdapt(server_parameters, SmolAgentsAdapter(structured_output=structured_output)) as tools:\n            yield cls(tools)\n\n\ndef tool(tool_function: Callable) -> Tool:\n    \"\"\"\n    Convert a function into an instance of a dynamically created Tool subclass.\n\n    Args:\n        tool_function (`Callable`): Function to convert into a Tool subclass.\n            Should have type hints for each input and a type hint for the output.\n            Should also have a docstring including the description of the function\n            and an 'Args:' part where each argument is described.\n    \"\"\"\n    tool_json_schema = get_json_schema(tool_function)[\"function\"]\n    if \"return\" not in tool_json_schema:\n        if len(tool_json_schema[\"parameters\"][\"properties\"]) == 0:\n            tool_json_schema[\"return\"] = {\"type\": \"null\"}\n        else:\n            raise TypeHintParsingException(\n                \"Tool return type not found: make sure your function has a return type hint!\"\n            )\n\n    class SimpleTool(Tool):\n        def __init__(self):\n            self.is_initialized = True\n\n    # Set the class attributes\n    SimpleTool.name = tool_json_schema[\"name\"]\n    SimpleTool.description = tool_json_schema[\"description\"]\n    SimpleTool.inputs = tool_json_schema[\"parameters\"][\"properties\"]\n    SimpleTool.output_type = tool_json_schema[\"return\"][\"type\"]\n\n    # Set output_schema if it exists in the JSON schema\n    if \"output_schema\" in tool_json_schema:\n        SimpleTool.output_schema = tool_json_schema[\"output_schema\"]\n    elif \"return\" in 
tool_json_schema and \"schema\" in tool_json_schema[\"return\"]:\n        SimpleTool.output_schema = tool_json_schema[\"return\"][\"schema\"]\n\n    @wraps(tool_function)\n    def wrapped_function(*args, **kwargs):\n        return tool_function(*args, **kwargs)\n\n    # Bind the copied function to the forward method\n    SimpleTool.forward = staticmethod(wrapped_function)\n\n    # Get the signature parameters of the tool function\n    sig = inspect.signature(tool_function)\n    # - Add \"self\" as first parameter to tool_function signature\n    new_sig = sig.replace(\n        parameters=[inspect.Parameter(\"self\", inspect.Parameter.POSITIONAL_OR_KEYWORD)] + list(sig.parameters.values())\n    )\n    # - Set the signature of the forward method\n    SimpleTool.forward.__signature__ = new_sig\n\n    # Create and attach the source code of the dynamically created tool class and forward method\n    # - Get the source code of tool_function\n    tool_source = textwrap.dedent(inspect.getsource(tool_function))\n    # - Remove the tool decorator and function definition line\n    lines = tool_source.splitlines()\n    tree = ast.parse(tool_source)\n    #   - Find function definition\n    func_node = next((node for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)), None)\n    if not func_node:\n        raise ValueError(\n            f\"No function definition found in the provided source of {tool_function.__name__}. \"\n            \"Ensure the input is a standard function.\"\n        )\n    #   - Extract decorator lines\n    decorator_lines = \"\"\n    if func_node.decorator_list:\n        tool_decorators = [d for d in func_node.decorator_list if isinstance(d, ast.Name) and d.id == \"tool\"]\n        if len(tool_decorators) > 1:\n            raise ValueError(\n                f\"Multiple @tool decorators found on function '{func_node.name}'. 
Only one @tool decorator is allowed.\"\n            )\n        if len(tool_decorators) < len(func_node.decorator_list):\n            warnings.warn(\n                f\"Function '{func_node.name}' has decorators other than @tool. \"\n                \"This may cause issues with serialization in the remote executor. See issue #1626.\"\n            )\n        decorator_start = tool_decorators[0].end_lineno if tool_decorators else 0\n        decorator_end = func_node.decorator_list[-1].end_lineno\n        decorator_lines = \"\\n\".join(lines[decorator_start:decorator_end])\n    #   - Extract tool source body\n    body_start = func_node.body[0].lineno - 1  # AST lineno starts at 1\n    tool_source_body = \"\\n\".join(lines[body_start:])\n    # - Create the forward method source, including def line and indentation\n    forward_method_source = f\"def forward{new_sig}:\\n{tool_source_body}\"\n    # - Create the class source\n    indent = \" \" * 4  # for class method\n    class_source = (\n        textwrap.dedent(f\"\"\"\n        class SimpleTool(Tool):\n            name: str = \"{tool_json_schema[\"name\"]}\"\n            description: str = {json.dumps(textwrap.dedent(tool_json_schema[\"description\"]).strip())}\n            inputs: dict[str, dict[str, str]] = {tool_json_schema[\"parameters\"][\"properties\"]}\n            output_type: str = \"{tool_json_schema[\"return\"][\"type\"]}\"\n\n            def __init__(self):\n                self.is_initialized = True\n\n        \"\"\")\n        + textwrap.indent(decorator_lines, indent)\n        + textwrap.indent(forward_method_source, indent)\n    )\n    # - Store the source code on both class and method for inspection\n    SimpleTool.__source__ = class_source\n    SimpleTool.forward.__source__ = forward_method_source\n\n    simple_tool = SimpleTool()\n    return simple_tool\n\n\nclass PipelineTool(Tool):\n    \"\"\"\n    A [`Tool`] tailored towards Transformer models. 
On top of the class attributes of the base class [`Tool`], you will\n    need to specify:\n\n    - **model_class** (`type`) -- The class to use to load the model in this tool.\n    - **default_checkpoint** (`str`) -- The default checkpoint that should be used when the user doesn't specify one.\n    - **pre_processor_class** (`type`, *optional*, defaults to [`transformers.AutoProcessor`]) -- The class to use to load the\n      pre-processor\n    - **post_processor_class** (`type`, *optional*, defaults to [`transformers.AutoProcessor`]) -- The class to use to load the\n      post-processor (when different from the pre-processor).\n\n    Args:\n        model (`str` or [`transformers.PreTrainedModel`], *optional*):\n            The name of the checkpoint to use for the model, or the instantiated model. If unset, will default to the\n            value of the class attribute `default_checkpoint`.\n        pre_processor (`str` or `Any`, *optional*):\n            The name of the checkpoint to use for the pre-processor, or the instantiated pre-processor (can be a\n            tokenizer, an image processor, a feature extractor or a processor). Will default to the value of `model` if\n            unset.\n        post_processor (`str` or `Any`, *optional*):\n            The name of the checkpoint to use for the post-processor, or the instantiated pre-processor (can be a\n            tokenizer, an image processor, a feature extractor or a processor). Will default to the `pre_processor` if\n            unset.\n        device (`int`, `str` or `torch.device`, *optional*):\n            The device on which to execute the model. 
Will default to any accelerator available (GPU, MPS etc...), the\n            CPU otherwise.\n        device_map (`str` or `dict`, *optional*):\n            If passed along, will be used to instantiate the model.\n        model_kwargs (`dict`, *optional*):\n            Any keyword argument to send to the model instantiation.\n        token (`str`, *optional*):\n            The token to use as HTTP bearer authorization for remote files. If unset, will use the token generated when\n            running `huggingface-cli login` (stored in `~/.huggingface`).\n        hub_kwargs (additional keyword arguments, *optional*):\n            Any additional keyword argument to send to the methods that will load the data from the Hub.\n    \"\"\"\n\n    pre_processor_class = None\n    model_class = None\n    post_processor_class = None\n    default_checkpoint = None\n    description = \"This is a pipeline tool\"\n    name = \"pipeline\"\n    inputs = {\"prompt\": str}\n    output_type = str\n    skip_forward_signature_validation = True\n\n    def __init__(\n        self,\n        model=None,\n        pre_processor=None,\n        post_processor=None,\n        device=None,\n        device_map=None,\n        model_kwargs=None,\n        token=None,\n        **hub_kwargs,\n    ):\n        if not _is_package_available(\"accelerate\") or not _is_package_available(\"torch\"):\n            raise ModuleNotFoundError(\n                \"Please install 'transformers' extra to use a PipelineTool: `pip install 'smolagents[transformers]'`\"\n            )\n\n        if model is None:\n            if self.default_checkpoint is None:\n                raise ValueError(\"This tool does not implement a default checkpoint, you need to pass one.\")\n            model = self.default_checkpoint\n        if pre_processor is None:\n            pre_processor = model\n\n        self.model = model\n        self.pre_processor = pre_processor\n        self.post_processor = post_processor\n        self.device = 
device\n        self.device_map = device_map\n        self.model_kwargs = {} if model_kwargs is None else model_kwargs\n        if device_map is not None:\n            self.model_kwargs[\"device_map\"] = device_map\n        self.hub_kwargs = hub_kwargs\n        self.hub_kwargs[\"token\"] = token\n\n        super().__init__()\n\n    def setup(self):\n        \"\"\"\n        Instantiates the `pre_processor`, `model` and `post_processor` if necessary.\n        \"\"\"\n        if isinstance(self.pre_processor, str):\n            if self.pre_processor_class is None:\n                from transformers import AutoProcessor\n\n                self.pre_processor_class = AutoProcessor\n            self.pre_processor = self.pre_processor_class.from_pretrained(self.pre_processor, **self.hub_kwargs)\n\n        if isinstance(self.model, str):\n            self.model = self.model_class.from_pretrained(self.model, **self.model_kwargs, **self.hub_kwargs)\n\n        if self.post_processor is None:\n            self.post_processor = self.pre_processor\n        elif isinstance(self.post_processor, str):\n            if self.post_processor_class is None:\n                from transformers import AutoProcessor\n\n                self.post_processor_class = AutoProcessor\n            self.post_processor = self.post_processor_class.from_pretrained(self.post_processor, **self.hub_kwargs)\n\n        if self.device is None:\n            if self.device_map is not None:\n                self.device = list(self.model.hf_device_map.values())[0]\n            else:\n                from accelerate import PartialState\n\n                self.device = PartialState().default_device\n\n        if self.device_map is None:\n            self.model.to(self.device)\n\n        super().setup()\n\n    def encode(self, raw_inputs):\n        \"\"\"\n        Uses the `pre_processor` to prepare the inputs for the `model`.\n        \"\"\"\n        return self.pre_processor(raw_inputs)\n\n    def forward(self, 
inputs):\n        \"\"\"\n        Sends the inputs through the `model`.\n        \"\"\"\n        import torch\n\n        with torch.no_grad():\n            return self.model(**inputs)\n\n    def decode(self, outputs):\n        \"\"\"\n        Uses the `post_processor` to decode the model output.\n        \"\"\"\n        return self.post_processor(outputs)\n\n    def __call__(self, *args, sanitize_inputs_outputs: bool = False, **kwargs):\n        import torch\n        from accelerate.utils import send_to_device\n\n        if not self.is_initialized:\n            self.setup()\n\n        if sanitize_inputs_outputs:\n            args, kwargs = handle_agent_input_types(*args, **kwargs)\n        encoded_inputs = self.encode(*args, **kwargs)\n\n        tensor_inputs = {k: v for k, v in encoded_inputs.items() if isinstance(v, torch.Tensor)}\n        non_tensor_inputs = {k: v for k, v in encoded_inputs.items() if not isinstance(v, torch.Tensor)}\n\n        encoded_inputs = send_to_device(tensor_inputs, self.device)\n        outputs = self.forward({**encoded_inputs, **non_tensor_inputs})\n        outputs = send_to_device(outputs, \"cpu\")\n        decoded_outputs = self.decode(outputs)\n        if sanitize_inputs_outputs:\n            decoded_outputs = handle_agent_output_types(decoded_outputs, self.output_type)\n        return decoded_outputs\n\n\ndef get_tools_definition_code(tools: dict[str, Tool]) -> str:\n    tool_codes = []\n    for tool in tools.values():\n        validate_tool_attributes(tool.__class__, check_imports=False)\n        tool_code = instance_to_source(tool, base_cls=Tool)\n        tool_code = tool_code.replace(\"from smolagents.tools import Tool\", \"\")\n        tool_code += f\"\\n\\n{tool.name} = {tool.__class__.__name__}()\\n\"\n        tool_codes.append(tool_code)\n\n    tool_definition_code = \"\\n\".join([f\"import {module}\" for module in BASE_BUILTIN_MODULES])\n    tool_definition_code += textwrap.dedent(\n        \"\"\"\n    from typing import 
Any\n\n    class Tool:\n        def __call__(self, *args, **kwargs):\n            return self.forward(*args, **kwargs)\n\n        def forward(self, *args, **kwargs):\n            pass # to be implemented in child class\n    \"\"\"\n    )\n    tool_definition_code += \"\\n\\n\".join(tool_codes)\n    return tool_definition_code\n\n\ndef validate_tool_arguments(tool: Tool, arguments: Any) -> None:\n    \"\"\"Validate tool arguments against tool's input schema.\n\n    Checks that all provided arguments match the tool's expected input types and that\n    all required arguments are present. Supports both dictionary arguments and single\n    value arguments for tools with one input parameter.\n\n    Args:\n        tool (`Tool`): Tool whose input schema will be used for validation.\n        arguments (`Any`): Arguments to validate. Can be a dictionary mapping\n            argument names to values, or a single value for tools with one input.\n\n\n    Raises:\n        ValueError: If an argument is not in the tool's input schema, if a required\n            argument is missing, or if the argument value doesn't match the expected type.\n        TypeError: If an argument has an incorrect type that cannot be converted\n            (e.g., string instead of number, excluding integer to number conversion).\n\n    Note:\n        - Supports type coercion from integer to number\n        - Handles nullable parameters when explicitly marked in the schema\n        - Accepts \"any\" type as a wildcard that matches all types\n    \"\"\"\n    if isinstance(arguments, dict):\n        for key, value in arguments.items():\n            if key not in tool.inputs:\n                raise ValueError(f\"Argument {key} is not in the tool's input schema\")\n\n            actual_type = _get_json_schema_type(type(value))[\"type\"]\n            expected_type = tool.inputs[key][\"type\"]\n            expected_type_is_nullable = tool.inputs[key].get(\"nullable\", False)\n\n            # Type is valid if it 
matches, is \"any\", or is null for nullable parameters\n            if (\n                (actual_type != expected_type if isinstance(expected_type, str) else actual_type not in expected_type)\n                and expected_type != \"any\"\n                and not (actual_type == \"null\" and expected_type_is_nullable)\n            ):\n                if actual_type == \"integer\" and expected_type == \"number\":\n                    continue\n                raise TypeError(f\"Argument {key} has type '{actual_type}' but should be '{tool.inputs[key]['type']}'\")\n\n        for key, schema in tool.inputs.items():\n            key_is_nullable = schema.get(\"nullable\", False)\n            if key not in arguments and not key_is_nullable:\n                raise ValueError(f\"Argument {key} is required\")\n        return None\n    else:\n        expected_type = list(tool.inputs.values())[0][\"type\"]\n        if _get_json_schema_type(type(arguments))[\"type\"] != expected_type and not expected_type == \"any\":\n            raise TypeError(f\"Argument has type '{type(arguments).__name__}' but should be '{expected_type}'\")\n\n\n__all__ = [\n    \"AUTHORIZED_TYPES\",\n    \"Tool\",\n    \"tool\",\n    \"load_tool\",\n    \"launch_gradio_demo\",\n    \"ToolCollection\",\n]\n"
  },
  {
    "path": "src/smolagents/utils.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport ast\nimport base64\nimport importlib.util\nimport inspect\nimport json\nimport keyword\nimport os\nimport random\nimport re\nimport time\nfrom functools import lru_cache\nfrom io import BytesIO\nfrom logging import Logger\nfrom pathlib import Path\nfrom textwrap import dedent\nfrom typing import TYPE_CHECKING, Any, Callable\n\nimport jinja2\n\n\nif TYPE_CHECKING:\n    from smolagents.memory import AgentLogger\n\n\n__all__ = [\"AgentError\"]\n\n\n@lru_cache\ndef _is_package_available(package_name: str) -> bool:\n    return importlib.util.find_spec(package_name) is not None\n\n\nBASE_BUILTIN_MODULES = [\n    \"collections\",\n    \"datetime\",\n    \"itertools\",\n    \"math\",\n    \"queue\",\n    \"random\",\n    \"re\",\n    \"stat\",\n    \"statistics\",\n    \"time\",\n    \"unicodedata\",\n]\n\n\ndef sanitize_for_rich(value) -> str:\n    \"\"\"\n    Convert arbitrary values (including bytes / control characters) into a safe string for Rich.\n    - Decodes bytes-like inputs using UTF-8 with replacement.\n    - Escapes bracket sequences that could be interpreted as markup while preserving valid Rich tags.\n    - Replaces ASCII control characters (except common whitespace) with visible escape sequences.\n    \"\"\"\n    if value is None:\n        s = \"\"\n    elif isinstance(value, 
str):\n        s = value\n    elif isinstance(value, (bytes, bytearray, memoryview)):\n        s = bytes(value).decode(\"utf-8\", errors=\"replace\")\n    else:\n        s = str(value)\n\n    out: list[str] = []\n    for ch in s:\n        code = ord(ch)\n        if ch in (\"\\n\", \"\\t\", \"\\r\"):\n            out.append(ch)\n        elif code < 32 or code == 127:\n            out.append(f\"\\\\x{code:02x}\")\n        else:\n            out.append(ch)\n    return \"\".join(out)\n\n\nclass AgentError(Exception):\n    \"\"\"Base class for other agent-related exceptions\"\"\"\n\n    def __init__(self, message, logger: \"AgentLogger\"):\n        super().__init__(message)\n        self.message = message\n        logger.log_error(message)\n\n    def dict(self) -> dict[str, str]:\n        return {\"type\": self.__class__.__name__, \"message\": str(self.message)}\n\n\nclass AgentParsingError(AgentError):\n    \"\"\"Exception raised for errors in parsing in the agent\"\"\"\n\n    pass\n\n\nclass AgentExecutionError(AgentError):\n    \"\"\"Exception raised for errors in execution in the agent\"\"\"\n\n    pass\n\n\nclass AgentMaxStepsError(AgentError):\n    \"\"\"Exception raised for errors in execution in the agent\"\"\"\n\n    pass\n\n\nclass AgentToolCallError(AgentExecutionError):\n    \"\"\"Exception raised for errors when incorrect arguments are passed to the tool\"\"\"\n\n    pass\n\n\nclass AgentToolExecutionError(AgentExecutionError):\n    \"\"\"Exception raised for errors when executing a tool\"\"\"\n\n    pass\n\n\nclass AgentGenerationError(AgentError):\n    \"\"\"Exception raised for errors in generation in the agent\"\"\"\n\n    pass\n\n\ndef make_json_serializable(obj: Any) -> Any:\n    \"\"\"Recursive function to make objects JSON serializable\"\"\"\n    if obj is None:\n        return None\n    elif isinstance(obj, (str, int, float, bool)):\n        # Try to parse string as JSON if it looks like a JSON object/array\n        if isinstance(obj, str):\n       
     try:\n                if (obj.startswith(\"{\") and obj.endswith(\"}\")) or (obj.startswith(\"[\") and obj.endswith(\"]\")):\n                    parsed = json.loads(obj)\n                    return make_json_serializable(parsed)\n            except json.JSONDecodeError:\n                pass\n        return obj\n    elif isinstance(obj, (list, tuple)):\n        return [make_json_serializable(item) for item in obj]\n    elif isinstance(obj, dict):\n        return {str(k): make_json_serializable(v) for k, v in obj.items()}\n    elif hasattr(obj, \"__dict__\"):\n        # For custom objects, convert their __dict__ to a serializable format\n        return {\"_type\": obj.__class__.__name__, **{k: make_json_serializable(v) for k, v in obj.__dict__.items()}}\n    else:\n        # For any other type, convert to string\n        return str(obj)\n\n\ndef parse_json_blob(json_blob: str) -> tuple[dict[str, str], str]:\n    \"Extracts the JSON blob from the input and returns the JSON data and the rest of the input.\"\n    try:\n        first_accolade_index = json_blob.find(\"{\")\n        last_accolade_index = [a.start() for a in list(re.finditer(\"}\", json_blob))][-1]\n        json_str = json_blob[first_accolade_index : last_accolade_index + 1]\n        json_data = json.loads(json_str, strict=False)\n        return json_data, json_blob[:first_accolade_index]\n    except IndexError:\n        raise ValueError(\"The model output does not contain any JSON blob.\")\n    except json.JSONDecodeError as e:\n        place = e.pos\n        if json_blob[place - 1 : place + 2] == \"},\\n\":\n            raise ValueError(\n                \"JSON is invalid: you probably tried to provide multiple tool calls in one action. 
PROVIDE ONLY ONE TOOL CALL.\"\n            )\n        raise ValueError(\n            f\"The JSON blob you used is invalid due to the following error: {e}.\\n\"\n            f\"JSON blob was: {json_blob}, decoding failed on that specific part of the blob:\\n\"\n            f\"'{json_blob[place - 4 : place + 5]}'.\"\n        )\n\n\ndef extract_code_from_text(text: str, code_block_tags: tuple[str, str]) -> str | None:\n    \"\"\"Extract code from the LLM's output.\"\"\"\n    pattern = rf\"{code_block_tags[0]}(.*?){code_block_tags[1]}\"\n    matches = re.findall(pattern, text, re.DOTALL)\n    if matches:\n        return \"\\n\\n\".join(match.strip() for match in matches)\n    return None\n\n\ndef parse_code_blobs(text: str, code_block_tags: tuple[str, str]) -> str:\n    \"\"\"Extract code blocks from the LLM's output.\n\n    If a valid code block is passed, it returns it directly.\n\n    Args:\n        text (`str`): LLM's output text to parse.\n        code_block_tags (`tuple[str, str]`): Opening and closing tags that delimit a code block.\n\n    Returns:\n        `str`: Extracted code block.\n\n    Raises:\n        ValueError: If no valid code block is found in the text.\n    \"\"\"\n    matches = extract_code_from_text(text, code_block_tags)\n    if not matches:  # Fallback to markdown pattern\n        matches = extract_code_from_text(text, (\"```(?:python|py)\", \"\\n```\"))\n    if matches:\n        return matches\n    # Maybe the LLM outputted a code blob directly\n    try:\n        ast.parse(text)\n        return text\n    except SyntaxError:\n        pass\n\n    if \"final\" in text and \"answer\" in text:\n        raise ValueError(\n            dedent(\n                f\"\"\"\n                Your code snippet is invalid, because the regex pattern {code_block_tags[0]}(.*?){code_block_tags[1]} was not found in it.\n                Here is your code snippet:\n                {text}\n                It seems like you're trying to return the final answer, you can do it as follows:\n                {code_block_tags[0]}\n                final_answer(\"YOUR FINAL 
ANSWER HERE\")\n                {code_block_tags[1]}\n                \"\"\"\n            ).strip()\n        )\n    raise ValueError(\n        dedent(\n            f\"\"\"\n            Your code snippet is invalid, because the regex pattern {code_block_tags[0]}(.*?){code_block_tags[1]} was not found in it.\n            Here is your code snippet:\n            {text}\n            Make sure to include code with the correct pattern, for instance:\n            Thoughts: Your thoughts\n            {code_block_tags[0]}\n            # Your python code here\n            {code_block_tags[1]}\n            \"\"\"\n        ).strip()\n    )\n\n\nMAX_LENGTH_TRUNCATE_CONTENT = 20000\n\n\ndef truncate_content(content: str, max_length: int = MAX_LENGTH_TRUNCATE_CONTENT) -> str:\n    if len(content) <= max_length:\n        return content\n    else:\n        return (\n            content[: max_length // 2]\n            + f\"\\n..._This content has been truncated to stay below {max_length} characters_...\\n\"\n            + content[-max_length // 2 :]\n        )\n\n\nclass ImportFinder(ast.NodeVisitor):\n    def __init__(self):\n        self.packages = set()\n\n    def visit_Import(self, node):\n        for alias in node.names:\n            # Get the base package name (before any dots)\n            base_package = alias.name.split(\".\")[0]\n            self.packages.add(base_package)\n\n    def visit_ImportFrom(self, node):\n        if node.module:  # for \"from x import y\" statements\n            # Get the base package name (before any dots)\n            base_package = node.module.split(\".\")[0]\n            self.packages.add(base_package)\n\n\ndef instance_to_source(instance, base_cls=None):\n    \"\"\"Convert an instance to its class source code representation.\"\"\"\n    cls = instance.__class__\n    class_name = cls.__name__\n\n    # Start building class lines\n    class_lines = []\n    if base_cls:\n        class_lines.append(f\"class {class_name}({base_cls.__name__}):\")\n    
else:\n        class_lines.append(f\"class {class_name}:\")\n\n    # Add docstring if it exists and differs from base\n    if cls.__doc__ and (not base_cls or cls.__doc__ != base_cls.__doc__):\n        class_lines.append(f'    \"\"\"{cls.__doc__}\"\"\"')\n\n    # Add class-level attributes\n    class_attrs = {\n        name: value\n        for name, value in cls.__dict__.items()\n        if not name.startswith(\"__\")\n        and not name == \"_abc_impl\"\n        and not callable(value)\n        and not (base_cls and hasattr(base_cls, name) and getattr(base_cls, name) == value)\n    }\n\n    for name, value in class_attrs.items():\n        if isinstance(value, str):\n            # multiline value\n            if \"\\n\" in value:\n                escaped_value = value.replace('\"\"\"', r\"\\\"\\\"\\\"\")  # Escape triple quotes\n                class_lines.append(f'    {name} = \"\"\"{escaped_value}\"\"\"')\n            else:\n                class_lines.append(f\"    {name} = {json.dumps(value)}\")\n        else:\n            class_lines.append(f\"    {name} = {repr(value)}\")\n\n    if class_attrs:\n        class_lines.append(\"\")\n\n    # Add methods\n    methods = {\n        name: func.__wrapped__ if hasattr(func, \"__wrapped__\") else func\n        for name, func in cls.__dict__.items()\n        if callable(func)\n        and (\n            not base_cls\n            or not hasattr(base_cls, name)\n            or (\n                isinstance(func, (staticmethod, classmethod))\n                or (getattr(base_cls, name).__code__.co_code != func.__code__.co_code)\n            )\n        )\n    }\n\n    for name, method in methods.items():\n        method_source = get_source(method)\n        # Clean up the indentation\n        method_lines = method_source.split(\"\\n\")\n        first_line = method_lines[0]\n        indent = len(first_line) - len(first_line.lstrip())\n        method_lines = [line[indent:] for line in method_lines]\n        method_source = 
\"\\n\".join([\"    \" + line if line.strip() else line for line in method_lines])\n        class_lines.append(method_source)\n        class_lines.append(\"\")\n\n    # Find required imports using ImportFinder\n    import_finder = ImportFinder()\n    import_finder.visit(ast.parse(\"\\n\".join(class_lines)))\n    required_imports = import_finder.packages\n\n    # Build final code with imports\n    final_lines = []\n\n    # Add base class import if needed\n    if base_cls:\n        final_lines.append(f\"from {base_cls.__module__} import {base_cls.__name__}\")\n\n    # Add discovered imports\n    for package in required_imports:\n        final_lines.append(f\"import {package}\")\n\n    if final_lines:  # Add empty line after imports\n        final_lines.append(\"\")\n\n    # Add the class code\n    final_lines.extend(class_lines)\n\n    return \"\\n\".join(final_lines)\n\n\ndef get_source(obj) -> str:\n    \"\"\"Get the source code of a class or callable object (e.g.: function, method).\n    First attempts to get the source code using `inspect.getsource`.\n    In a dynamic environment (e.g.: Jupyter, IPython), if this fails,\n    falls back to retrieving the source code from the current interactive shell session.\n\n    Args:\n        obj: A class or callable object (e.g.: function, method)\n\n    Returns:\n        str: The source code of the object, dedented and stripped\n\n    Raises:\n        TypeError: If object is not a class or callable\n        OSError: If source code cannot be retrieved from any source\n        ValueError: If source cannot be found in IPython history\n\n    Note:\n        TODO: handle Python standard REPL\n    \"\"\"\n    if not (isinstance(obj, type) or callable(obj)):\n        raise TypeError(f\"Expected class or callable, got {type(obj)}\")\n\n    inspect_error = None\n    try:\n        # Handle dynamically created classes\n        source = getattr(obj, \"__source__\", None) or inspect.getsource(obj)\n        return dedent(source).strip()\n 
   except OSError as e:\n        # let's keep track of the exception to raise it if all further methods fail\n        inspect_error = e\n    try:\n        import IPython\n\n        shell = IPython.get_ipython()\n        if not shell:\n            raise ImportError(\"No active IPython shell found\")\n        all_cells = \"\\n\".join(shell.user_ns.get(\"In\", [])).strip()\n        if not all_cells:\n            raise ValueError(\"No code cells found in IPython session\")\n\n        tree = ast.parse(all_cells)\n        for node in ast.walk(tree):\n            if isinstance(node, (ast.ClassDef, ast.FunctionDef)) and node.name == obj.__name__:\n                return dedent(\"\\n\".join(all_cells.split(\"\\n\")[node.lineno - 1 : node.end_lineno])).strip()\n        raise ValueError(f\"Could not find source code for {obj.__name__} in IPython history\")\n    except ImportError:\n        # IPython is not available, let's just raise the original inspect error\n        raise inspect_error\n    except ValueError as e:\n        # IPython is available but we couldn't find the source code, let's raise the error\n        raise e from inspect_error\n\n\ndef encode_image_base64(image):\n    buffered = BytesIO()\n    image.save(buffered, format=\"PNG\")\n    return base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n\n\ndef make_image_url(base64_image):\n    return f\"data:image/png;base64,{base64_image}\"\n\n\ndef make_init_file(folder: str | Path):\n    os.makedirs(folder, exist_ok=True)\n    # Create __init__\n    with open(os.path.join(folder, \"__init__.py\"), \"w\"):\n        pass\n\n\ndef is_valid_name(name: str) -> bool:\n    return name.isidentifier() and not keyword.iskeyword(name) if isinstance(name, str) else False\n\n\nAGENT_GRADIO_APP_TEMPLATE = \"\"\"import yaml\nimport os\nfrom smolagents import GradioUI, {{ class_name }}, {{ agent_dict['model']['class'] }}\n\n# Get current directory path\nCURRENT_DIR = os.path.dirname(os.path.abspath(__file__))\n\n{% for tool in 
tools.values() -%}\nfrom {{managed_agent_relative_path}}tools.{{ tool.name }} import {{ tool.__class__.__name__ }} as {{ tool.name | camelcase }}\n{% endfor %}\n{% for managed_agent in managed_agents.values() -%}\nfrom {{managed_agent_relative_path}}managed_agents.{{ managed_agent.name }}.app import agent_{{ managed_agent.name }}\n{% endfor %}\n\nmodel = {{ agent_dict['model']['class'] }}(\n{% for key in agent_dict['model']['data'] if key != 'class' -%}\n    {{ key }}={{ agent_dict['model']['data'][key]|repr }},\n{% endfor %})\n\n{% for tool in tools.values() -%}\n{{ tool.name }} = {{ tool.name | camelcase }}()\n{% endfor %}\n\nwith open(os.path.join(CURRENT_DIR, \"prompts.yaml\"), 'r') as stream:\n    prompt_templates = yaml.safe_load(stream)\n\n{{ agent_name }} = {{ class_name }}(\n    model=model,\n    tools=[{% for tool_name in tools.keys() if tool_name != \"final_answer\" %}{{ tool_name }}{% if not loop.last %}, {% endif %}{% endfor %}],\n    managed_agents=[{% for subagent_name in managed_agents.keys() %}agent_{{ subagent_name }}{% if not loop.last %}, {% endif %}{% endfor %}],\n    {% for attribute_name, value in agent_dict.items() if attribute_name not in [\"class\", \"model\", \"tools\", \"prompt_templates\", \"authorized_imports\", \"managed_agents\", \"requirements\"] -%}\n    {{ attribute_name }}={{ value|repr }},\n    {% endfor %}prompt_templates=prompt_templates\n)\nif __name__ == \"__main__\":\n    GradioUI({{ agent_name }}).launch()\n\"\"\".strip()\n\n\ndef create_agent_gradio_app_template():\n    env = jinja2.Environment(loader=jinja2.BaseLoader(), undefined=jinja2.StrictUndefined)\n    env.filters[\"repr\"] = repr\n    env.filters[\"camelcase\"] = lambda value: \"\".join(word.capitalize() for word in value.split(\"_\"))\n    return env.from_string(AGENT_GRADIO_APP_TEMPLATE)\n\n\nclass RateLimiter:\n    \"\"\"Simple rate limiter that enforces a minimum delay between consecutive requests.\n\n    This class is useful for limiting the rate of 
operations such as API requests,\n    by ensuring that calls to `throttle()` are spaced out by at least a given interval\n    based on the desired requests per minute.\n\n    If no rate is specified (i.e., `requests_per_minute` is None), rate limiting\n    is disabled and `throttle()` becomes a no-op.\n\n    Args:\n        requests_per_minute (`float | None`): Maximum number of allowed requests per minute.\n            Use `None` to disable rate limiting.\n    \"\"\"\n\n    def __init__(self, requests_per_minute: float | None = None):\n        self._enabled = requests_per_minute is not None\n        self._interval = 60.0 / requests_per_minute if self._enabled else 0.0\n        self._last_call = 0.0\n\n    def throttle(self):\n        \"\"\"Pause execution to respect the rate limit, if enabled.\"\"\"\n        if not self._enabled:\n            return\n        now = time.time()\n        elapsed = now - self._last_call\n        if elapsed < self._interval:\n            time.sleep(self._interval - elapsed)\n        self._last_call = time.time()\n\n\nclass Retrying:\n    \"\"\"Simple retrying controller. 
Inspired from library [tenacity](https://github.com/jd/tenacity/).\"\"\"\n\n    def __init__(\n        self,\n        max_attempts: int = 1,\n        wait_seconds: float = 0.0,\n        exponential_base: float = 2.0,\n        jitter: bool = True,\n        retry_predicate: Callable[[BaseException], bool] | None = None,\n        reraise: bool = False,\n        before_sleep_logger: tuple[Logger, int] | None = None,\n        after_logger: tuple[Logger, int] | None = None,\n    ):\n        self.max_attempts = max_attempts\n        self.wait_seconds = wait_seconds\n        self.exponential_base = exponential_base\n        self.jitter = jitter\n        self.retry_predicate = retry_predicate\n        self.reraise = reraise\n        self.before_sleep_logger = before_sleep_logger\n        self.after_logger = after_logger\n\n    def __call__(self, fn, *args: Any, **kwargs: Any) -> Any:\n        start_time = time.time()\n        delay = self.wait_seconds\n\n        for attempt_number in range(1, self.max_attempts + 1):\n            try:\n                result = fn(*args, **kwargs)\n\n                # Log after successful call if we had retries\n                if self.after_logger and attempt_number > 1:\n                    logger, log_level = self.after_logger\n                    seconds = time.time() - start_time\n                    fn_name = getattr(fn, \"__name__\", repr(fn))\n                    logger.log(\n                        log_level,\n                        f\"Finished call to '{fn_name}' after {seconds:.3f}(s), this was attempt n°{attempt_number}/{self.max_attempts}.\",\n                    )\n\n                return result\n\n            except BaseException as e:\n                # Check if we should retry\n                should_retry = self.retry_predicate(e) if self.retry_predicate else False\n\n                # If this is the last attempt or we shouldn't retry, raise\n                if not should_retry or attempt_number >= self.max_attempts:\n     
               # Both settings of `reraise` propagate the original exception unchanged\n                    raise\n\n                # Log after failed attempt\n                if self.after_logger:\n                    logger, log_level = self.after_logger\n                    seconds = time.time() - start_time\n                    fn_name = getattr(fn, \"__name__\", repr(fn))\n                    logger.log(\n                        log_level,\n                        f\"Finished call to '{fn_name}' after {seconds:.3f}(s), this was attempt n°{attempt_number}/{self.max_attempts}.\",\n                    )\n\n                # Exponential backoff with jitter\n                # https://cookbook.openai.com/examples/how_to_handle_rate_limits#example-3-manual-backoff-implementation\n                delay *= self.exponential_base * (1 + self.jitter * random.random())\n\n                # Log before sleeping\n                if self.before_sleep_logger:\n                    logger, log_level = self.before_sleep_logger\n                    fn_name = getattr(fn, \"__name__\", repr(fn))\n                    logger.log(\n                        log_level,\n                        f\"Retrying {fn_name} in {delay} seconds as it raised {e.__class__.__name__}: {e}.\",\n                    )\n\n                # Sleep before next attempt\n                if delay > 0:\n                    time.sleep(delay)\n"
  },
  {
    "path": "src/smolagents/vision_web_browser.py",
    "content": "import argparse\nfrom io import BytesIO\nfrom time import sleep\n\nimport helium\nimport PIL.Image\nfrom dotenv import load_dotenv\nfrom selenium import webdriver\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\n\nfrom smolagents import CodeAgent, WebSearchTool, tool\nfrom smolagents.agents import ActionStep\nfrom smolagents.cli import load_model\n\n\ngithub_request = \"\"\"\nI'm trying to find how hard I have to work to get a repo in github.com/trending.\nCan you navigate to the profile for the top author of the top trending repo, and give me their total number of commits over the last year?\n\"\"\"  # The agent is able to achieve this request only when powered by GPT-4o or Claude-3.5-sonnet.\n\nsearch_request = \"\"\"\nPlease navigate to https://en.wikipedia.org/wiki/Chicago and give me a sentence containing the word \"1992\" that mentions a construction accident.\n\"\"\"\n\n\ndef parse_arguments():\n    parser = argparse.ArgumentParser(description=\"Run a web browser automation script with a specified model.\")\n    parser.add_argument(\n        \"prompt\",\n        type=str,\n        nargs=\"?\",  # Makes it optional\n        default=search_request,\n        help=\"The prompt to run with the agent\",\n    )\n    parser.add_argument(\n        \"--model-type\",\n        type=str,\n        default=\"LiteLLMModel\",\n        help=\"The model type to use (e.g., OpenAIModel, LiteLLMModel, TransformersModel, InferenceClientModel)\",\n    )\n    parser.add_argument(\n        \"--model-id\",\n        type=str,\n        default=\"gpt-4o\",\n        help=\"The model ID to use for the specified model type\",\n    )\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        help=\"The inference provider to use for the model\",\n    )\n    parser.add_argument(\n        \"--api-base\",\n        type=str,\n        help=\"The API base to use for the model\",\n    )\n    parser.add_argument(\n      
  \"--api-key\",\n        type=str,\n        help=\"The API key to use for the model\",\n    )\n    return parser.parse_args()\n\n\ndef save_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:\n    sleep(1.0)  # Let JavaScript animations happen before taking the screenshot\n    driver = helium.get_driver()\n    current_step = memory_step.step_number\n    if driver is not None:\n        for previous_memory_step in agent.memory.steps:  # Remove previous screenshots from logs for lean processing\n            if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= current_step - 2:\n                previous_memory_step.observations_images = None\n        png_bytes = driver.get_screenshot_as_png()\n        image = PIL.Image.open(BytesIO(png_bytes))\n        print(f\"Captured a browser screenshot: {image.size} pixels\")\n        memory_step.observations_images = [image.copy()]  # Create a copy to ensure it persists, important!\n\n    # Update observations with current URL\n    url_info = f\"Current url: {driver.current_url}\"\n    memory_step.observations = (\n        url_info if memory_step.observations is None else memory_step.observations + \"\\n\" + url_info\n    )\n    return\n\n\ndef _escape_xpath_string(s: str) -> str:\n    \"\"\"\n    Escapes a string for safe use in an XPath expression.\n\n    Args:\n        s (`str`): Arbitrary input string to escape.\n\n    Returns:\n        `str`: Valid XPath expression representing the literal value of `s`.\n    \"\"\"\n    if \"'\" not in s:\n        return f\"'{s}'\"\n    if '\"' not in s:\n        return f'\"{s}\"'\n    parts = s.split(\"'\")\n    return \"concat(\" + ', \"\\'\", '.join(f\"'{p}'\" for p in parts) + \")\"\n\n\n@tool\ndef search_item_ctrl_f(text: str, nth_result: int = 1) -> str:\n    \"\"\"\n    Searches for text on the current page via Ctrl + F and jumps to the nth occurrence.\n    Args:\n        text: The text to search for\n        nth_result: Which occurrence 
to jump to (default: 1)\n    \"\"\"\n    escaped_text = _escape_xpath_string(text)\n    elements = driver.find_elements(By.XPATH, f\"//*[contains(text(), {escaped_text})]\")\n    if nth_result > len(elements):\n        raise Exception(f\"Match n°{nth_result} not found (only {len(elements)} matches found)\")\n    result = f\"Found {len(elements)} matches for '{text}'.\"\n    elem = elements[nth_result - 1]\n    driver.execute_script(\"arguments[0].scrollIntoView(true);\", elem)\n    result += f\" Focused on element {nth_result} of {len(elements)}\"\n    return result\n\n\n@tool\ndef go_back() -> None:\n    \"\"\"Goes back to previous page.\"\"\"\n    driver.back()\n\n\n@tool\ndef close_popups() -> str:\n    \"\"\"\n    Closes any visible modal or pop-up on the page. Use this to dismiss pop-up windows! This does not work on cookie consent banners.\n    \"\"\"\n    webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()\n\n\ndef initialize_driver():\n    \"\"\"Initialize the Selenium WebDriver.\"\"\"\n    chrome_options = webdriver.ChromeOptions()\n    chrome_options.add_argument(\"--force-device-scale-factor=1\")\n    chrome_options.add_argument(\"--window-size=1000,1350\")\n    chrome_options.add_argument(\"--disable-pdf-viewer\")\n    chrome_options.add_argument(\"--window-position=0,0\")\n    return helium.start_chrome(headless=False, options=chrome_options)\n\n\ndef initialize_agent(model):\n    \"\"\"Initialize the CodeAgent with the specified model.\"\"\"\n    return CodeAgent(\n        tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],\n        model=model,\n        additional_authorized_imports=[\"helium\"],\n        step_callbacks=[save_screenshot],\n        max_steps=20,\n        verbosity_level=2,\n    )\n\n\nhelium_instructions = \"\"\"\nUse your web_search tool when you want to get Google search results.\nThen you can use helium to access websites. 
Don't use helium for Google search, only for navigating websites!\nDon't bother about the helium driver, it's already managed.\nWe've already run \"from helium import *\"\nThen you can go to pages!\n<code>\ngo_to('github.com/trending')\n</code>\n\nYou can directly click clickable elements by inputting the text that appears on them.\n<code>\nclick(\"Top products\")\n</code>\n\nIf it's a link:\n<code>\nclick(Link(\"Top products\"))\n</code>\n\nIf you try to interact with an element and it's not found, you'll get a LookupError.\nIn general, stop your action after each button click to see what happens on your screenshot.\nNever try to log in to a page.\n\nTo scroll up or down, use scroll_down or scroll_up, passing the number of pixels to scroll as the argument.\n<code>\nscroll_down(num_pixels=1200) # This will scroll one viewport down\n</code>\n\nWhen you have pop-ups with a cross icon to close, don't try to click the close icon by finding its element or targeting an 'X' element (this most often fails).\nJust use your built-in tool `close_popups` to close them:\n<code>\nclose_popups()\n</code>\n\nYou can use .exists() to check for the existence of an element. For example:\n<code>\nif Text('Accept cookies?').exists():\n    click('I accept')\n</code>\n\nProceed in several steps rather than trying to solve the task in one shot.\nAnd at the end, only when you have your answer, return your final answer.\n<code>\nfinal_answer(\"YOUR_ANSWER_HERE\")\n</code>\n\nIf pages seem stuck on loading, you might have to wait, for instance `import time` and run `time.sleep(5.0)`. 
But don't overuse this!\nTo list elements on page, DO NOT try code-based element searches like 'contributors = find_all(S(\"ol > li\"))': just look at the latest screenshot you have and read it visually, or use your tool search_item_ctrl_f.\nOf course, you can act on buttons like a user would do when navigating.\nAfter each code blob you write, you will be automatically provided with an updated screenshot of the browser and the current browser url.\nBut beware that the screenshot will only be taken at the end of the whole action, it won't see intermediate states.\nDon't kill the browser.\nWhen you have modals or cookie banners on screen, you should get rid of them before you can click anything else.\n\"\"\"\n\n\ndef run_webagent(\n    prompt: str,\n    model_type: str,\n    model_id: str,\n    provider: str | None = None,\n    api_base: str | None = None,\n    api_key: str | None = None,\n) -> None:\n    # Load environment variables\n    load_dotenv()\n\n    # Initialize the model based on the provided arguments\n    model = load_model(model_type, model_id, provider=provider, api_base=api_base, api_key=api_key)\n\n    global driver\n    driver = initialize_driver()\n    agent = initialize_agent(model)\n\n    # Run the agent with the provided prompt\n    agent.python_executor(\"from helium import *\")\n    agent.run(prompt + helium_instructions)\n\n\ndef main() -> None:\n    # Parse command line arguments\n    args = parse_arguments()\n    run_webagent(args.prompt, args.model_type, args.model_id, args.provider, args.api_base, args.api_key)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "tests/__init__.py",
    "content": ""
  },
  {
    "path": "tests/conftest.py",
    "content": "from unittest.mock import patch\n\nimport pytest\n\nfrom smolagents.agents import MultiStepAgent\nfrom smolagents.monitoring import LogLevel\n\n\n# Import fixture modules as plugins\npytest_plugins = [\"tests.fixtures.agents\", \"tests.fixtures.tools\"]\n\noriginal_multi_step_agent_init = MultiStepAgent.__init__\n\n\n@pytest.fixture(autouse=True)\ndef patch_multi_step_agent_with_suppressed_logging():\n    with patch.object(MultiStepAgent, \"__init__\", autospec=True) as mock_init:\n\n        def init_with_suppressed_logging(self, *args, verbosity_level=LogLevel.OFF, **kwargs):\n            original_multi_step_agent_init(self, *args, verbosity_level=verbosity_level, **kwargs)\n\n        mock_init.side_effect = init_with_suppressed_logging\n        yield\n"
  },
  {
    "path": "tests/fixtures/agents.py",
    "content": "import pytest\n\n\nAGENT_DICTS = {\n    \"v1.9\": {\n        \"tools\": [],\n        \"model\": {\n            \"class\": \"InferenceClientModel\",\n            \"data\": {\n                \"last_input_token_count\": None,\n                \"last_output_token_count\": None,\n                \"model_id\": \"Qwen/Qwen2.5-Coder-32B-Instruct\",\n                \"provider\": None,\n            },\n        },\n        \"managed_agents\": {},\n        \"prompt_templates\": {\n            \"system_prompt\": \"dummy system prompt\",\n            \"planning\": {\n                \"initial_facts\": \"dummy planning initial facts\",\n                \"initial_plan\": \"dummy planning initial plan\",\n                \"update_facts_pre_messages\": \"dummy planning update facts pre messages\",\n                \"update_facts_post_messages\": \"dummy planning update facts post messages\",\n                \"update_plan_pre_messages\": \"dummy planning update plan pre messages\",\n                \"update_plan_post_messages\": \"dummy planning update plan post messages\",\n            },\n            \"managed_agent\": {\n                \"task\": \"dummy managed agent task\",\n                \"report\": \"dummy managed agent report\",\n            },\n            \"final_answer\": {\n                \"pre_messages\": \"dummy final answer pre messages\",\n                \"post_messages\": \"dummy final answer post messages\",\n            },\n        },\n        \"max_steps\": 10,\n        \"verbosity_level\": 2,\n        \"grammar\": None,\n        \"planning_interval\": 2,\n        \"name\": \"test_agent\",\n        \"description\": \"dummy description\",\n        \"requirements\": [\"smolagents\"],\n        \"authorized_imports\": [\"pandas\"],\n    },\n    # Added: executor_type, executor_kwargs, max_print_outputs_length\n    \"v1.10\": {\n        \"tools\": [],\n        \"model\": {\n            \"class\": \"InferenceClientModel\",\n            \"data\": 
{\n                \"last_input_token_count\": None,\n                \"last_output_token_count\": None,\n                \"model_id\": \"Qwen/Qwen2.5-Coder-32B-Instruct\",\n                \"provider\": None,\n            },\n        },\n        \"managed_agents\": {},\n        \"prompt_templates\": {\n            \"system_prompt\": \"dummy system prompt\",\n            \"planning\": {\n                \"initial_facts\": \"dummy planning initial facts\",\n                \"initial_plan\": \"dummy planning initial plan\",\n                \"update_facts_pre_messages\": \"dummy planning update facts pre messages\",\n                \"update_facts_post_messages\": \"dummy planning update facts post messages\",\n                \"update_plan_pre_messages\": \"dummy planning update plan pre messages\",\n                \"update_plan_post_messages\": \"dummy planning update plan post messages\",\n            },\n            \"managed_agent\": {\n                \"task\": \"dummy managed agent task\",\n                \"report\": \"dummy managed agent report\",\n            },\n            \"final_answer\": {\n                \"pre_messages\": \"dummy final answer pre messages\",\n                \"post_messages\": \"dummy final answer post messages\",\n            },\n        },\n        \"max_steps\": 10,\n        \"verbosity_level\": 2,\n        \"grammar\": None,\n        \"planning_interval\": 2,\n        \"name\": \"test_agent\",\n        \"description\": \"dummy description\",\n        \"requirements\": [\"smolagents\"],\n        \"authorized_imports\": [\"pandas\"],\n        \"executor_type\": \"local\",\n        \"executor_kwargs\": {},\n        \"max_print_outputs_length\": None,\n    },\n    # Removed: grammar, last_input_token_count, last_output_token_count\n    \"v1.20\": {\n        \"tools\": [],\n        \"model\": {\n            \"class\": \"InferenceClientModel\",\n            \"data\": {\n                \"model_id\": 
\"Qwen/Qwen2.5-Coder-32B-Instruct\",\n                \"provider\": None,\n            },\n        },\n        \"managed_agents\": {},\n        \"prompt_templates\": {\n            \"system_prompt\": \"dummy system prompt\",\n            \"planning\": {\n                \"initial_facts\": \"dummy planning initial facts\",\n                \"initial_plan\": \"dummy planning initial plan\",\n                \"update_facts_pre_messages\": \"dummy planning update facts pre messages\",\n                \"update_facts_post_messages\": \"dummy planning update facts post messages\",\n                \"update_plan_pre_messages\": \"dummy planning update plan pre messages\",\n                \"update_plan_post_messages\": \"dummy planning update plan post messages\",\n            },\n            \"managed_agent\": {\n                \"task\": \"dummy managed agent task\",\n                \"report\": \"dummy managed agent report\",\n            },\n            \"final_answer\": {\n                \"pre_messages\": \"dummy final answer pre messages\",\n                \"post_messages\": \"dummy final answer post messages\",\n            },\n        },\n        \"max_steps\": 10,\n        \"verbosity_level\": 2,\n        \"planning_interval\": 2,\n        \"name\": \"test_agent\",\n        \"description\": \"dummy description\",\n        \"requirements\": [\"smolagents\"],\n        \"authorized_imports\": [\"pandas\"],\n        \"executor_type\": \"local\",\n        \"executor_kwargs\": {},\n        \"max_print_outputs_length\": None,\n    },\n}\n\n\n@pytest.fixture\ndef get_agent_dict():\n    def _get_agent_dict(agent_dict_key):\n        return AGENT_DICTS[agent_dict_key]\n\n    return _get_agent_dict\n"
  },
  {
    "path": "tests/fixtures/tools.py",
    "content": "import pytest\n\nfrom smolagents.tools import Tool, tool\n\n\n@pytest.fixture\ndef test_tool():\n    class TestTool(Tool):\n        name = \"test_tool\"\n        description = \"A test tool\"\n        inputs = {\"input\": {\"type\": \"string\", \"description\": \"Input value\"}}\n        output_type = \"string\"\n\n        def forward(self, input):\n            if input == \"error\":\n                raise ValueError(\"Tool execution error\")\n            return f\"Processed: {input}\"\n\n    return TestTool()\n\n\n@pytest.fixture\ndef no_input_tool():\n    class NoInputTool(Tool):\n        name = \"no_input_tool\"\n        description = \"Tool with no inputs\"\n        inputs = {}\n        output_type = \"string\"\n\n        def forward(self):\n            return \"test\"\n\n    return NoInputTool()\n\n\n@pytest.fixture\ndef single_input_tool():\n    class SingleInputTool(Tool):\n        name = \"single_input_tool\"\n        description = \"Tool with one input\"\n        inputs = {\"text\": {\"type\": \"string\", \"description\": \"Input text\"}}\n        output_type = \"string\"\n\n        def forward(self, text):\n            return \"test\"\n\n    return SingleInputTool()\n\n\n@pytest.fixture\ndef multi_input_tool():\n    class MultiInputTool(Tool):\n        name = \"multi_input_tool\"\n        description = \"Tool with multiple inputs\"\n        inputs = {\n            \"text\": {\"type\": \"string\", \"description\": \"Text input\"},\n            \"count\": {\"type\": \"integer\", \"description\": \"Number count\"},\n        }\n        output_type = \"object\"\n\n        def forward(self, text, count):\n            return \"test\"\n\n    return MultiInputTool()\n\n\n@pytest.fixture\ndef multiline_description_tool():\n    class MultilineDescriptionTool(Tool):\n        name = \"multiline_description_tool\"\n        description = \"This is a tool with\\nmultiple lines\\nin the description\"\n        inputs = {\"input\": {\"type\": \"string\", 
\"description\": \"Some input\"}}\n        output_type = \"string\"\n\n        def forward(self, input):\n            return \"test\"\n\n    return MultilineDescriptionTool()\n\n\n@pytest.fixture\ndef example_tool():\n    @tool\n    def valid_tool_function(input: str) -> str:\n        \"\"\"A valid tool function.\n\n        Args:\n            input (str): Input string.\n        \"\"\"\n        return input.upper()\n\n    return valid_tool_function\n\n\n@pytest.fixture\ndef boolean_default_tool_class():\n    class BooleanDefaultTool(Tool):\n        name = \"boolean_default_tool\"\n        description = \"A tool with a boolean default parameter\"\n        inputs = {\n            \"text\": {\"type\": \"string\", \"description\": \"Input text\"},\n            \"flag\": {\"type\": \"boolean\", \"description\": \"Boolean flag with default value\", \"nullable\": True},\n        }\n        output_type = \"string\"\n\n        def forward(self, text: str, flag: bool = False) -> str:\n            return f\"Text: {text}, Flag: {flag}\"\n\n    return BooleanDefaultTool()\n\n\n@pytest.fixture\ndef boolean_default_tool_function():\n    @tool\n    def boolean_default_tool(text: str, flag: bool = False) -> str:\n        \"\"\"\n        A tool with a boolean default parameter.\n\n        Args:\n            text: Input text\n            flag: Boolean flag with default value\n        \"\"\"\n        return f\"Text: {text}, Flag: {flag}\"\n\n    return boolean_default_tool\n\n\n@pytest.fixture\ndef optional_input_tool_class():\n    class OptionalInputTool(Tool):\n        name = \"optional_input_tool\"\n        description = \"A tool with an optional input parameter\"\n        inputs = {\n            \"required_text\": {\"type\": \"string\", \"description\": \"Required input text\"},\n            \"optional_text\": {\"type\": \"string\", \"description\": \"Optional input text\", \"nullable\": True},\n        }\n        output_type = \"string\"\n\n        def forward(self, required_text: 
str, optional_text: str | None = None) -> str:\n            if optional_text:\n                return f\"{required_text} + {optional_text}\"\n            return required_text\n\n    return OptionalInputTool()\n\n\n@pytest.fixture\ndef optional_input_tool_function():\n    @tool\n    def optional_input_tool(required_text: str, optional_text: str | None = None) -> str:\n        \"\"\"\n        A tool with an optional input parameter.\n\n        Args:\n            required_text: Required input text\n            optional_text: Optional input text\n        \"\"\"\n        if optional_text:\n            return f\"{required_text} + {optional_text}\"\n        return required_text\n\n    return optional_input_tool\n"
  },
  {
    "path": "tests/test_agents.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport io\nimport json\nimport os\nimport re\nimport tempfile\nimport uuid\nfrom collections.abc import Generator\nfrom contextlib import nullcontext as does_not_raise\nfrom dataclasses import dataclass\nfrom pathlib import Path\nfrom textwrap import dedent\nfrom typing import Optional\nfrom unittest.mock import MagicMock, patch\n\nimport pytest\nfrom huggingface_hub import (\n    ChatCompletionOutputFunctionDefinition,\n    ChatCompletionOutputMessage,\n    ChatCompletionOutputToolCall,\n)\nfrom rich.console import Console\n\nfrom smolagents import EMPTY_PROMPT_TEMPLATES\nfrom smolagents.agent_types import AgentImage, AgentText\nfrom smolagents.agents import (\n    AgentError,\n    AgentMaxStepsError,\n    AgentToolCallError,\n    CodeAgent,\n    MultiStepAgent,\n    RunResult,\n    ToolCall,\n    ToolCallingAgent,\n    ToolOutput,\n    populate_template,\n)\nfrom smolagents.default_tools import DuckDuckGoSearchTool, FinalAnswerTool, PythonInterpreterTool, VisitWebpageTool\nfrom smolagents.memory import (\n    ActionStep,\n    CallbackRegistry,\n    FinalAnswerStep,\n    MemoryStep,\n    PlanningStep,\n    SystemPromptStep,\n    TaskStep,\n)\nfrom smolagents.models import (\n    ChatMessage,\n    ChatMessageToolCall,\n    ChatMessageToolCallFunction,\n    InferenceClientModel,\n    MessageRole,\n    Model,\n    TransformersModel,\n)\nfrom 
smolagents.monitoring import AgentLogger, LogLevel, Timing, TokenUsage\nfrom smolagents.tools import Tool, tool\nfrom smolagents.utils import (\n    BASE_BUILTIN_MODULES,\n    AgentExecutionError,\n    AgentGenerationError,\n    AgentToolExecutionError,\n)\n\n\n@dataclass\nclass ChoiceDeltaToolCallFunction:\n    arguments: Optional[str] = None\n    name: Optional[str] = None\n\n\n@dataclass\nclass ChoiceDeltaToolCall:\n    index: Optional[int] = None\n    id: Optional[str] = None\n    function: Optional[ChoiceDeltaToolCallFunction] = None\n    type: Optional[str] = None\n\n\n@dataclass\nclass ChoiceDelta:\n    content: Optional[str] = None\n    function_call: Optional[str] = None\n    refusal: Optional[str] = None\n    role: Optional[str] = None\n    tool_calls: Optional[list] = None\n\n\ndef get_new_path(suffix=\"\") -> str:\n    directory = tempfile.mkdtemp()\n    return os.path.join(directory, str(uuid.uuid4()) + suffix)\n\n\n@pytest.fixture\ndef agent_logger():\n    return AgentLogger(\n        LogLevel.DEBUG, console=Console(record=True, no_color=True, force_terminal=False, file=io.StringIO())\n    )\n\n\nclass FakeToolCallModel(Model):\n    def generate(self, messages, tools_to_call_from=None, stop_sequences=None):\n        if len(messages) < 3:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"I will call the python interpreter.\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_0\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            name=\"python_interpreter\", arguments={\"code\": \"2*3.6452\"}\n                        ),\n                    )\n                ],\n            )\n        else:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"I will return the final answer.\",\n                
tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"final_answer\", arguments={\"answer\": \"7.2904\"}),\n                    )\n                ],\n            )\n\n\nclass FakeToolCallModelImage(Model):\n    def generate(self, messages, tools_to_call_from=None, stop_sequences=None):\n        if len(messages) < 3:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_0\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            name=\"fake_image_generation_tool\",\n                            arguments={\"prompt\": \"An image of a cat\"},\n                        ),\n                    )\n                ],\n            )\n        else:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"final_answer\", arguments=\"image.png\"),\n                    )\n                ],\n            )\n\n\nclass FakeToolCallModelVL(Model):\n    def generate(self, messages, tools_to_call_from=None, stop_sequences=None):\n        if len(messages) < 3:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_0\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            
name=\"fake_image_understanding_tool\",\n                            arguments={\n                                \"prompt\": \"What is in this image?\",\n                                \"image\": \"image.png\",\n                            },\n                        ),\n                    )\n                ],\n            )\n        else:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"final_answer\", arguments=\"The image is a cat.\"),\n                    )\n                ],\n            )\n\n\nclass FakeCodeModel(Model):\n    def generate(self, messages, stop_sequences=None):\n        prompt = str(messages)\n        if \"special_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I should multiply 2 by 3.6452. special_marker\n<code>\nresult = 2**3.6452\n</code>\n\"\"\",\n            )\n        else:  # We're at step 2\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I can now answer the initial question\n<code>\nfinal_answer(7.2904)\n</code>\n\"\"\",\n            )\n\n\nclass FakeCodeModelImageGeneration(Model):\n    def generate(self, messages, stop_sequences=None):\n        prompt = str(messages)\n        if \"special_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I should generate an image. 
special_marker\n<code>\nimage = image_generation_tool()\n</code>\n\"\"\",\n            )\n        else:  # We're at step 2\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I can now answer the initial question\n<code>\nfinal_answer(image)\n</code>\n\"\"\",\n            )\n\n\nclass FakeCodeModelPlanning(Model):\n    def generate(self, messages, stop_sequences=None):\n        prompt = str(messages)\n        if \"planning_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"llm plan update planning_marker\",\n                token_usage=TokenUsage(input_tokens=10, output_tokens=10),\n            )\n        elif \"action_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I should multiply 2 by 3.6452. action_marker\n<code>\nresult = 2**3.6452\n</code>\n\"\"\",\n                token_usage=TokenUsage(input_tokens=10, output_tokens=10),\n            )\n        else:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"llm plan again\",\n                token_usage=TokenUsage(input_tokens=10, output_tokens=10),\n            )\n\n\nclass FakeCodeModelError(Model):\n    def generate(self, messages, stop_sequences=None):\n        prompt = str(messages)\n        if \"special_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I should multiply 2 by 3.6452. 
special_marker\n<code>\nprint(\"Flag!\")\ndef error_function():\n    raise ValueError(\"error\")\n\nerror_function()\n</code>\n\"\"\",\n            )\n        else:  # We're at step 2\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I faced an error in the previous step.\n<code>\nfinal_answer(\"got an error\")\n</code>\n\"\"\",\n            )\n\n\nclass FakeCodeModelSyntaxError(Model):\n    def generate(self, messages, stop_sequences=None):\n        prompt = str(messages)\n        if \"special_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I should multiply 2 by 3.6452. special_marker\n<code>\na = 2\nb = a * 2\n    print(\"Failing due to unexpected indent\")\nprint(\"Ok, calculation done!\")\n</code>\n\"\"\",\n            )\n        else:  # We're at step 2\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I can now answer the initial question\n<code>\nfinal_answer(\"got an error\")\n</code>\n\"\"\",\n            )\n\n\nclass FakeCodeModelImport(Model):\n    def generate(self, messages, stop_sequences=None):\n        return ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"\"\"\nThought: I can answer the question\n<code>\nimport numpy as np\nfinal_answer(\"got an error\")\n</code>\n\"\"\",\n        )\n\n\nclass FakeCodeModelFunctionDef(Model):\n    def generate(self, messages, stop_sequences=None):\n        prompt = str(messages)\n        if \"special_marker\" not in prompt:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: Let's define the function. 
special_marker\n<code>\nimport numpy as np\n\ndef moving_average(x, w):\n    return np.convolve(x, np.ones(w), 'valid') / w\n</code>\n    \"\"\",\n            )\n        else:  # We're at step 2\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"\nThought: I can now answer the initial question\n<code>\nx, w = [0, 1, 2, 3, 4, 5], 2\nres = moving_average(x, w)\nfinal_answer(res)\n</code>\n\"\"\",\n            )\n\n\nclass FakeCodeModelSingleStep(Model):\n    def generate(self, messages, stop_sequences=None):\n        return ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"\"\"\nThought: I should multiply 2 by 3.6452. special_marker\n<code>\nresult = python_interpreter(code=\"2*3.6452\")\nfinal_answer(result)\n```\n\"\"\",\n        )\n\n\nclass FakeCodeModelNoReturn(Model):\n    def generate(self, messages, stop_sequences=None):\n        return ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"\"\"\nThought: I should multiply 2 by 3.6452. 
special_marker\n<code>\nresult = python_interpreter(code=\"2*3.6452\")\nprint(result)\n```\n\"\"\",\n        )\n\n\nclass TestAgent:\n    def test_fake_toolcalling_agent(self):\n        agent = ToolCallingAgent(tools=[PythonInterpreterTool()], model=FakeToolCallModel())\n        output = agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert isinstance(output, str)\n        assert \"7.2904\" in output\n        assert agent.memory.steps[0].task == \"What is 2 multiplied by 3.6452?\"\n        assert \"7.2904\" in agent.memory.steps[1].observations\n        assert agent.memory.steps[2].model_output == \"I will return the final answer.\"\n\n    def test_toolcalling_agent_handles_image_tool_outputs(self, shared_datadir):\n        import PIL.Image\n\n        @tool\n        def fake_image_generation_tool(prompt: str) -> PIL.Image.Image:\n            \"\"\"Tool that generates an image.\n\n            Args:\n                prompt: The prompt\n            \"\"\"\n\n            import PIL.Image\n\n            return PIL.Image.open(shared_datadir / \"000000039769.png\")\n\n        agent = ToolCallingAgent(tools=[fake_image_generation_tool], model=FakeToolCallModelImage())\n        output = agent.run(\"Make me an image.\")\n        assert isinstance(output, AgentImage)\n        assert isinstance(agent.state[\"image.png\"], PIL.Image.Image)\n\n    def test_toolcalling_agent_handles_image_inputs(self, shared_datadir):\n        import PIL.Image\n\n        image = PIL.Image.open(shared_datadir / \"000000039769.png\")  # dummy input\n\n        @tool\n        def fake_image_understanding_tool(prompt: str, image: PIL.Image.Image) -> str:\n            \"\"\"Tool that creates a caption for an image.\n\n            Args:\n                prompt: The prompt\n                image: The image\n            \"\"\"\n            return \"The image is a cat.\"\n\n        agent = ToolCallingAgent(tools=[fake_image_understanding_tool], model=FakeToolCallModelVL())\n        output = 
agent.run(\"Caption this image.\", images=[image])\n        assert output == \"The image is a cat.\"\n\n    def test_fake_code_agent(self):\n        agent = CodeAgent(tools=[PythonInterpreterTool()], model=FakeCodeModel(), verbosity_level=10)\n        output = agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert isinstance(output, float)\n        assert output == 7.2904\n        assert agent.memory.steps[0].task == \"What is 2 multiplied by 3.6452?\"\n        assert agent.memory.steps[2].tool_calls == [\n            ToolCall(name=\"python_interpreter\", arguments=\"final_answer(7.2904)\", id=\"call_2\")\n        ]\n\n    def test_additional_args_added_to_task(self):\n        agent = CodeAgent(tools=[], model=FakeCodeModel())\n        agent.run(\n            \"What is 2 multiplied by 3.6452?\",\n            additional_args={\"instruction\": \"Remember this.\"},\n        )\n        assert \"Remember this\" in agent.task\n\n    def test_reset_conversations(self):\n        agent = CodeAgent(tools=[PythonInterpreterTool()], model=FakeCodeModel())\n        output = agent.run(\"What is 2 multiplied by 3.6452?\", reset=True)\n        assert output == 7.2904\n        assert len(agent.memory.steps) == 3\n\n        output = agent.run(\"What is 2 multiplied by 3.6452?\", reset=False)\n        assert output == 7.2904\n        assert len(agent.memory.steps) == 5\n\n        output = agent.run(\"What is 2 multiplied by 3.6452?\", reset=True)\n        assert output == 7.2904\n        assert len(agent.memory.steps) == 3\n\n    def test_setup_agent_with_empty_toolbox(self):\n        ToolCallingAgent(model=FakeToolCallModel(), tools=[])\n\n    def test_fails_max_steps(self):\n        agent = CodeAgent(\n            tools=[PythonInterpreterTool()],\n            model=FakeCodeModelNoReturn(),  # use this callable because it never ends\n            max_steps=5,\n        )\n        answer = agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert len(agent.memory.steps) == 
7  # Task step + 5 action steps + Final answer\n        assert type(agent.memory.steps[-1].error) is AgentMaxStepsError\n        assert isinstance(answer, str)\n\n        agent = CodeAgent(\n            tools=[PythonInterpreterTool()],\n            model=FakeCodeModelNoReturn(),  # use this callable because it never ends\n            max_steps=5,\n        )\n        answer = agent.run(\"What is 2 multiplied by 3.6452?\", max_steps=3)\n        assert len(agent.memory.steps) == 5  # Task step + 3 action steps + Final answer\n        assert type(agent.memory.steps[-1].error) is AgentMaxStepsError\n        assert isinstance(answer, str)\n\n    def test_tool_descriptions_get_baked_in_system_prompt(self):\n        tool = PythonInterpreterTool()\n        tool.name = \"fake_tool_name\"\n        tool.description = \"fake_tool_description\"\n        agent = CodeAgent(tools=[tool], model=FakeCodeModel())\n        agent.run(\"Empty task\")\n        assert agent.system_prompt is not None\n        assert f\"def {tool.name}(\" in agent.system_prompt\n        assert f'\"\"\"{tool.description}' in agent.system_prompt\n\n    def test_module_imports_get_baked_in_system_prompt(self):\n        agent = CodeAgent(tools=[], model=FakeCodeModel())\n        agent.run(\"Empty task\")\n        for module in BASE_BUILTIN_MODULES:\n            assert module in agent.system_prompt\n\n    def test_init_agent_with_different_toolsets(self):\n        toolset_1 = []\n        agent = CodeAgent(tools=toolset_1, model=FakeCodeModel())\n        assert len(agent.tools) == 1  # when no tools are provided, only the final_answer tool is added by default\n\n        toolset_2 = [PythonInterpreterTool(), PythonInterpreterTool()]\n        with pytest.raises(ValueError) as e:\n            agent = CodeAgent(tools=toolset_2, model=FakeCodeModel())\n        assert \"Each tool or managed_agent should have a unique name!\" in str(e)\n\n        with pytest.raises(ValueError) as e:\n            agent.name = 
\"python_interpreter\"\n            agent.description = \"empty\"\n            CodeAgent(tools=[PythonInterpreterTool()], model=FakeCodeModel(), managed_agents=[agent])\n        assert \"Each tool or managed_agent should have a unique name!\" in str(e)\n\n        # check that python_interpreter base tool does not get added to CodeAgent\n        agent = CodeAgent(tools=[], model=FakeCodeModel(), add_base_tools=True)\n        assert len(agent.tools) == 3  # added final_answer tool + search + visit_webpage\n\n        # check that python_interpreter base tool gets added to ToolCallingAgent\n        agent = ToolCallingAgent(tools=[], model=FakeCodeModel(), add_base_tools=True)\n        assert len(agent.tools) == 4  # added final_answer tool + search + visit_webpage + python_interpreter\n\n    def test_function_persistence_across_steps(self):\n        agent = CodeAgent(\n            tools=[],\n            model=FakeCodeModelFunctionDef(),\n            max_steps=2,\n            additional_authorized_imports=[\"numpy\"],\n        )\n        res = agent.run(\"ok\")\n        assert res[0] == 0.5\n\n    def test_init_managed_agent(self):\n        agent = CodeAgent(tools=[], model=FakeCodeModelFunctionDef(), name=\"managed_agent\", description=\"Empty\")\n        assert agent.name == \"managed_agent\"\n        assert agent.description == \"Empty\"\n\n    def test_agent_description_gets_correctly_inserted_in_system_prompt(self):\n        managed_agent = CodeAgent(\n            tools=[], model=FakeCodeModelFunctionDef(), name=\"managed_agent\", description=\"Empty\"\n        )\n        manager_agent = CodeAgent(\n            tools=[],\n            model=FakeCodeModelFunctionDef(),\n            managed_agents=[managed_agent],\n        )\n        assert \"You can also give tasks to team members.\" not in managed_agent.system_prompt\n        assert \"{{managed_agents_descriptions}}\" not in managed_agent.system_prompt\n        assert \"You can also give tasks to team members.\" in 
manager_agent.system_prompt\n\n    def test_replay_shows_logs(self, agent_logger):\n        agent = CodeAgent(\n            tools=[],\n            model=FakeCodeModelImport(),\n            verbosity_level=0,\n            additional_authorized_imports=[\"numpy\"],\n            logger=agent_logger,\n        )\n        agent.run(\"Count to 3\")\n\n        str_output = agent_logger.console.export_text()\n\n        assert \"New run\" in str_output\n        assert 'final_answer(\"got' in str_output\n        assert \"</code>\" in str_output\n\n        agent = ToolCallingAgent(tools=[PythonInterpreterTool()], model=FakeToolCallModel(), verbosity_level=0)\n        agent.logger = agent_logger\n\n        agent.run(\"What is 2 multiplied by 3.6452?\")\n        agent.replay()\n\n        str_output = agent_logger.console.export_text()\n        assert \"arguments\" in str_output\n\n    def test_code_nontrivial_final_answer_works(self):\n        class FakeCodeModelFinalAnswer(Model):\n            def generate(self, messages, stop_sequences=None):\n                return ChatMessage(\n                    role=MessageRole.ASSISTANT,\n                    content=\"\"\"<code>\ndef nested_answer():\n    final_answer(\"Correct!\")\n\nnested_answer()\n</code>\"\"\",\n                )\n\n        agent = CodeAgent(tools=[], model=FakeCodeModelFinalAnswer())\n\n        output = agent.run(\"Count to 3\")\n        assert output == \"Correct!\"\n\n    def test_transformers_toolcalling_agent(self):\n        @tool\n        def weather_api(location: str, celsius: str = \"\") -> str:\n            \"\"\"\n            Gets the weather in the next days at given location.\n            Secretly this tool does not care about the location, it hates the weather everywhere.\n\n            Args:\n                location: the location\n                celsius: the temperature type\n            \"\"\"\n            return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n       
 model = TransformersModel(\n            model_id=\"HuggingFaceTB/SmolLM2-360M-Instruct\",\n            max_new_tokens=100,\n            device_map=\"auto\",\n            do_sample=False,\n        )\n        agent = ToolCallingAgent(model=model, tools=[weather_api], max_steps=1)\n        task = \"What is the weather in Paris? \"\n        agent.run(task)\n        assert agent.memory.steps[0].task == task\n        assert agent.memory.steps[1].tool_calls[0].name == \"weather_api\"\n        step_memory_dict = agent.memory.get_succinct_steps()[1]\n        assert step_memory_dict[\"model_output_message\"][\"tool_calls\"][0][\"function\"][\"name\"] == \"weather_api\"\n        assert step_memory_dict[\"model_output_message\"][\"raw\"][\"completion_kwargs\"][\"max_new_tokens\"] == 100\n        assert \"model_input_messages\" in agent.memory.get_full_steps()[1]\n        assert step_memory_dict[\"token_usage\"][\"total_tokens\"] > 100\n        assert step_memory_dict[\"timing\"][\"duration\"] > 0.1\n\n    def test_final_answer_checks(self):\n        error_string = \"failed with error\"\n\n        def check_always_fails(final_answer, memory, agent):\n            assert False, \"Error raised in check\"\n\n        agent = CodeAgent(model=FakeCodeModel(), tools=[], final_answer_checks=[check_always_fails])\n        agent.run(\"Dummy task.\")\n        assert error_string in str(agent.write_memory_to_messages())\n        assert \"Error raised in check\" in str(agent.write_memory_to_messages())\n\n        agent = CodeAgent(\n            model=FakeCodeModel(),\n            tools=[],\n            final_answer_checks=[lambda x, memory, agent: x == 7.2904],\n            verbosity_level=1000,\n        )\n        output = agent.run(\"Dummy task.\")\n        assert output == 7.2904  # Check that output is correct\n        assert len([step for step in agent.memory.steps if isinstance(step, ActionStep)]) == 2\n        assert error_string not in str(agent.write_memory_to_messages())\n\n    
def test_final_answer_checks_with_agent_access(self):\n        \"\"\"Test that final answer checks can access agent properties.\"\"\"\n\n        def check_uses_agent_properties(final_answer, memory, agent):\n            # Access agent properties to validate the final answer\n            assert hasattr(agent, \"memory\"), \"Agent should have memory attribute\"\n            assert hasattr(agent, \"state\"), \"Agent should have state attribute\"\n            assert hasattr(agent, \"task\"), \"Agent should have task attribute\"\n\n            # Check that the final answer is related to the task\n            if isinstance(final_answer, str):\n                return len(final_answer) > 0\n            return True\n\n        def check_uses_agent_state(final_answer, memory, agent):\n            # Use agent state to validate the answer\n            if \"expected_answer\" in agent.state:\n                return final_answer == agent.state[\"expected_answer\"]\n            return True\n\n        # Test with a check that uses agent properties\n        agent = CodeAgent(model=FakeCodeModel(), tools=[], final_answer_checks=[check_uses_agent_properties])\n        output = agent.run(\"Dummy task.\")\n        assert output == 7.2904  # Should pass the check\n\n        # Test with a check that uses agent state\n        agent = CodeAgent(model=FakeCodeModel(), tools=[], final_answer_checks=[check_uses_agent_state])\n        agent.state[\"expected_answer\"] = 7.2904\n        output = agent.run(\"Dummy task.\")\n        assert output == 7.2904  # Should pass the check\n\n        # Test with a check that fails due to state mismatch\n        agent = CodeAgent(\n            model=FakeCodeModel(),\n            tools=[],\n            final_answer_checks=[check_uses_agent_state],\n            max_steps=3,  # Limit steps to avoid long test run\n        )\n        agent.state[\"expected_answer\"] = \"wrong answer\"\n        output = agent.run(\"Dummy task.\")\n\n        # The agent should have 
reached max steps and provided a final answer anyway\n        assert output is not None\n        # Check that there were failed validation attempts in the memory\n        failed_steps = [step for step in agent.memory.steps if hasattr(step, \"error\") and step.error is not None]\n        assert len(failed_steps) > 0, \"Expected some steps to have validation errors\"\n\n        # Check that at least one error message contains our check function name\n        error_messages = [str(step.error) for step in failed_steps if step.error is not None]\n        assert any(\"check_uses_agent_state failed\" in msg for msg in error_messages), (\n            \"Expected to find validation error message\"\n        )\n\n    def test_generation_errors_are_raised(self):\n        class FakeCodeModel(Model):\n            def generate(self, messages, stop_sequences=None):\n                assert False, \"Generation failed\"\n\n        agent = CodeAgent(model=FakeCodeModel(), tools=[])\n        with pytest.raises(AgentGenerationError) as e:\n            agent.run(\"Dummy task.\")\n        assert len(agent.memory.steps) == 2\n        assert \"Generation failed\" in str(e)\n\n    def test_planning_step_with_injected_memory(self):\n        \"\"\"Test that agent properly uses update plan prompts when memory is injected before a run.\n\n        This test verifies:\n        1. Planning steps are created with the correct frequency\n        2. Injected memory is included in planning context\n        3. 
Messages are properly formatted with expected roles and content\n        \"\"\"\n        planning_interval = 1\n        max_steps = 4\n        task = \"Continuous task\"\n        previous_task = \"Previous user request\"\n\n        # Create agent with planning capability\n        agent = CodeAgent(\n            tools=[],\n            planning_interval=planning_interval,\n            model=FakeCodeModelPlanning(),\n            max_steps=max_steps,\n        )\n\n        # Inject memory before run to simulate existing conversation history\n        previous_step = TaskStep(task=previous_task)\n        agent.memory.steps.append(previous_step)\n\n        # Run the agent\n        agent.run(task, reset=False)\n\n        # Extract and validate planning steps\n        planning_steps = [step for step in agent.memory.steps if isinstance(step, PlanningStep)]\n        assert len(planning_steps) > 2, \"Expected multiple planning steps to be generated\"\n\n        # Verify first planning step incorporates injected memory\n        first_planning_step = planning_steps[0]\n        input_messages = first_planning_step.model_input_messages\n\n        # Check message structure and content\n        assert len(input_messages) == 4, (\n            \"First planning step should have 4 messages: system-plan-pre-update + memory + task + user-plan-post-update\"\n        )\n\n        # Verify system message contains current task\n        system_message = input_messages[0]\n        assert system_message.role == \"system\", \"First message should have system role\"\n        assert task in system_message.content[0][\"text\"], f\"System message should contain the current task: '{task}'\"\n\n        # Verify memory message contains previous task\n        memory_message = input_messages[1]\n        assert previous_task in memory_message.content[0][\"text\"], (\n            f\"Memory message should contain previous task: '{previous_task}'\"\n        )\n\n        # Verify task message contains current 
task\n        task_message = input_messages[2]\n        assert task in task_message.content[0][\"text\"], f\"Task message should contain current task: '{task}'\"\n\n        # Verify user message for planning\n        user_message = input_messages[3]\n        assert user_message.role == \"user\", \"Fourth message should have user role\"\n\n        # Verify second planning step has more context from first agent actions\n        second_planning_step = planning_steps[1]\n        second_messages = second_planning_step.model_input_messages\n\n        # Check that conversation history is growing appropriately\n        assert len(second_messages) == 6, \"Second planning step should have 6 messages including tool interactions\"\n\n        # Verify all conversation elements are present\n        conversation_text = \"\".join([msg.content[0][\"text\"] for msg in second_messages if hasattr(msg, \"content\")])\n        assert previous_task in conversation_text, \"Previous task should be included in the conversation history\"\n        assert task in conversation_text, \"Current task should be included in the conversation history\"\n        assert \"tools\" in conversation_text, \"Tool interactions should be included in the conversation history\"\n\n\nclass CustomFinalAnswerTool(FinalAnswerTool):\n    def forward(self, answer) -> str:\n        return answer + \"CUSTOM\"\n\n\nclass MockTool(Tool):\n    def __init__(self, name):\n        self.name = name\n        self.description = \"Mock tool description\"\n        self.inputs = {}\n        self.output_type = \"string\"\n\n    def forward(self):\n        return \"Mock tool output\"\n\n\nclass MockAgent:\n    def __init__(self, name, tools, description=\"Mock agent description\"):\n        self.name = name\n        self.tools = {t.name: t for t in tools}\n        self.description = description\n\n\nclass DummyMultiStepAgent(MultiStepAgent):\n    def step(self, memory_step: ActionStep) -> Generator[None]:\n        yield None\n\n    
def initialize_system_prompt(self):\n        pass\n\n\nclass FakeLLMModel(Model):\n    def __init__(self, give_token_usage: bool = True):\n        self.give_token_usage = give_token_usage\n\n    def generate(self, prompt, tools_to_call_from=None, **kwargs):\n        if tools_to_call_from is not None:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"I will call the final_answer tool.\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"fake_id\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            name=\"final_answer\", arguments={\"answer\": \"This is the final answer.\"}\n                        ),\n                    )\n                ],\n                token_usage=TokenUsage(input_tokens=10, output_tokens=20) if self.give_token_usage else None,\n            )\n        else:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"<code>\nfinal_answer('This is the final answer.')\n</code>\"\"\",\n                token_usage=TokenUsage(input_tokens=10, output_tokens=20) if self.give_token_usage else None,\n            )\n\n\nclass TestRunResult:\n    def test_backward_compatibility(self):\n        \"\"\"Test that RunResult handles deprecated 'messages' parameter correctly.\"\"\"\n\n        # Test 1: Using new 'steps' parameter (should work without warning)\n        result1 = RunResult(\n            output=\"test output\",\n            state=\"success\",\n            steps=[{\"type\": \"test\", \"content\": \"step1\"}],\n            token_usage=None,\n            timing=Timing(start_time=0.0, end_time=1.0),\n        )\n        assert result1.steps == [{\"type\": \"test\", \"content\": \"step1\"}]\n\n        # Test property access warning\n        with pytest.warns(FutureWarning, match=\"deprecated\"):\n            messages 
= result1.messages\n        assert messages == [{\"type\": \"test\", \"content\": \"step1\"}]\n\n        # Test 2: Using deprecated 'messages' parameter (should show deprecation warning)\n        with pytest.warns(FutureWarning, match=\"deprecated\"):\n            result2 = RunResult(\n                output=\"test output\",\n                state=\"success\",\n                messages=[{\"type\": \"test\", \"content\": \"message1\"}],\n                token_usage=None,\n                timing=Timing(start_time=0.0, end_time=1.0),\n            )\n        assert result2.steps == [{\"type\": \"test\", \"content\": \"message1\"}]\n\n        # Test 3: Using both 'steps' and 'messages' (should raise ValueError)\n        with pytest.raises(ValueError, match=\"Cannot specify both\"):\n            RunResult(\n                output=\"test output\",\n                state=\"success\",\n                steps=[{\"type\": \"test\", \"content\": \"step1\"}],\n                messages=[{\"type\": \"test\", \"content\": \"message1\"}],\n                token_usage=None,\n                timing=Timing(start_time=0.0, end_time=1.0),\n            )\n\n    @pytest.mark.parametrize(\"agent_class\", [CodeAgent, ToolCallingAgent])\n    def test_no_token_usage(self, agent_class):\n        agent = agent_class(\n            tools=[],\n            model=FakeLLMModel(give_token_usage=False),\n            max_steps=1,\n            return_full_result=True,\n        )\n\n        result = agent.run(\"Fake task\")\n\n        assert isinstance(result, RunResult)\n        assert result.output == \"This is the final answer.\"\n        assert result.state == \"success\"\n        assert result.token_usage is None\n        assert isinstance(result.steps, list)\n        assert result.timing.duration > 0\n\n    @pytest.mark.parametrize(\n        \"init_return_full_result,run_return_full_result,expect_runresult\",\n        [\n            (True, None, True),\n            (False, None, False),\n            
(True, False, False),\n            (False, True, True),\n        ],\n    )\n    def test_full_result(self, init_return_full_result, run_return_full_result, expect_runresult):\n        agent = ToolCallingAgent(\n            tools=[],\n            model=FakeLLMModel(),\n            max_steps=1,\n            return_full_result=init_return_full_result,\n        )\n        result = agent.run(\"Fake task\", return_full_result=run_return_full_result)\n\n        if expect_runresult:\n            assert isinstance(result, RunResult)\n            assert result.output == \"This is the final answer.\"\n            assert result.state == \"success\"\n            assert result.token_usage == TokenUsage(input_tokens=10, output_tokens=20)\n            assert isinstance(result.steps, list)\n            assert result.timing.duration > 0\n        else:\n            assert isinstance(result, str)\n\n\nclass TestMultiStepAgent:\n    def test_instantiation_disables_logging_to_terminal(self):\n        fake_model = MagicMock()\n        agent = DummyMultiStepAgent(tools=[], model=fake_model)\n        assert agent.logger.level == -1, \"logging to terminal should be disabled for testing using a fixture\"\n\n    def test_instantiation_with_prompt_templates(self, prompt_templates):\n        agent = DummyMultiStepAgent(tools=[], model=MagicMock(), prompt_templates=prompt_templates)\n        assert agent.prompt_templates == prompt_templates\n        assert agent.prompt_templates[\"system_prompt\"] == \"This is a test system prompt.\"\n        assert \"managed_agent\" in agent.prompt_templates\n        assert agent.prompt_templates[\"managed_agent\"][\"task\"] == \"Task for {{name}}: {{task}}\"\n        assert agent.prompt_templates[\"managed_agent\"][\"report\"] == \"Report for {{name}}: {{final_answer}}\"\n\n    @pytest.mark.parametrize(\n        \"tools, expected_final_answer_tool\",\n        [([], FinalAnswerTool), ([CustomFinalAnswerTool()], CustomFinalAnswerTool)],\n    )\n    def 
test_instantiation_with_final_answer_tool(self, tools, expected_final_answer_tool):\n        agent = DummyMultiStepAgent(tools=tools, model=MagicMock())\n        assert \"final_answer\" in agent.tools\n        assert isinstance(agent.tools[\"final_answer\"], expected_final_answer_tool)\n\n    def test_system_prompt_property(self):\n        \"\"\"Test that system_prompt property is read-only and calls initialize_system_prompt.\"\"\"\n\n        class SimpleAgent(MultiStepAgent):\n            def initialize_system_prompt(self) -> str:\n                return \"Test system prompt\"\n\n            def step(self, memory_step: ActionStep) -> Generator[None]:\n                yield None\n\n        # Create a simple agent with mocked model\n        model = MagicMock()\n        agent = SimpleAgent(tools=[], model=model)\n\n        # Test reading the property works and calls initialize_system_prompt\n        assert agent.system_prompt == \"Test system prompt\"\n\n        # Test setting the property raises AttributeError with correct message\n        with pytest.raises(\n            AttributeError,\n            match=re.escape(\n                \"\"\"The 'system_prompt' property is read-only. 
Use 'self.prompt_templates[\"system_prompt\"]' instead.\"\"\"\n            ),\n        ):\n            agent.system_prompt = \"New system prompt\"\n\n        # assert \"read-only\" in str(exc_info.value)\n        # assert \"Use 'self.prompt_templates[\\\"system_prompt\\\"]' instead\" in str(exc_info.value)\n\n    @pytest.mark.parametrize(\n        \"step_callbacks, expected_registry_state\",\n        [\n            # Case 0: None as input (initializes empty registry)\n            (\n                None,\n                {\n                    \"MemoryStep\": 0,\n                    \"ActionStep\": 1,\n                    \"PlanningStep\": 0,\n                    \"TaskStep\": 0,\n                    \"SystemPromptStep\": 0,\n                    \"FinalAnswerStep\": 0,\n                },  # Only monitor.update_metrics is registered for ActionStep\n            ),\n            # Case 1: List of callbacks (registers only for ActionStep: backward compatibility)\n            (\n                [MagicMock(), MagicMock()],\n                {\n                    \"MemoryStep\": 0,\n                    \"ActionStep\": 3,\n                    \"PlanningStep\": 0,\n                    \"TaskStep\": 0,\n                    \"SystemPromptStep\": 0,\n                    \"FinalAnswerStep\": 0,\n                },\n            ),\n            # Case 2: Dict mapping specific step types to callbacks\n            (\n                {ActionStep: MagicMock(), PlanningStep: MagicMock()},\n                {\n                    \"MemoryStep\": 0,\n                    \"ActionStep\": 2,\n                    \"PlanningStep\": 1,\n                    \"TaskStep\": 0,\n                    \"SystemPromptStep\": 0,\n                    \"FinalAnswerStep\": 0,\n                },\n            ),\n            # Case 3: Dict with list of callbacks for a step type\n            (\n                {ActionStep: [MagicMock(), MagicMock()]},\n                {\n                    \"MemoryStep\": 
0,\n                    \"ActionStep\": 3,\n                    \"PlanningStep\": 0,\n                    \"TaskStep\": 0,\n                    \"SystemPromptStep\": 0,\n                    \"FinalAnswerStep\": 0,\n                },\n            ),\n            # Case 4: Dict with mixed single and list callbacks\n            (\n                {ActionStep: MagicMock(), MemoryStep: [MagicMock(), MagicMock()]},\n                {\n                    \"MemoryStep\": 2,\n                    \"ActionStep\": 2,\n                    \"PlanningStep\": 0,\n                    \"TaskStep\": 0,\n                    \"SystemPromptStep\": 0,\n                    \"FinalAnswerStep\": 0,\n                },\n            ),\n        ],\n    )\n    def test_setup_step_callbacks(self, step_callbacks, expected_registry_state):\n        \"\"\"Test that _setup_step_callbacks correctly sets up the callback registry.\"\"\"\n        # Create a dummy agent\n        agent = DummyMultiStepAgent(tools=[], model=MagicMock())\n        # Mock the monitor\n        agent.monitor = MagicMock()\n\n        # Call the method\n        agent._setup_step_callbacks(step_callbacks)\n\n        # Check that step_callbacks is a CallbackRegistry\n        assert isinstance(agent.step_callbacks, CallbackRegistry)\n\n        # Count callbacks for each step type\n        actual_registry_state = {}\n        for step_type in [MemoryStep, ActionStep, PlanningStep, TaskStep, SystemPromptStep, FinalAnswerStep]:\n            callbacks = agent.step_callbacks._callbacks.get(step_type, [])\n            actual_registry_state[step_type.__name__] = len(callbacks)\n\n        # Verify registry state matches expected\n        assert actual_registry_state == expected_registry_state\n\n    def test_finalize_step_callbacks_with_list(self):\n        # Create mock callbacks\n        callback1 = MagicMock()\n        callback2 = MagicMock()\n\n        # Create a test agent with a list of callbacks\n        agent = 
DummyMultiStepAgent(tools=[], model=MagicMock(), step_callbacks=[callback1, callback2])\n\n        # Create steps of different types\n        action_step = ActionStep(step_number=1, timing=Timing(start_time=0.0))\n        planning_step = PlanningStep(\n            timing=Timing(start_time=1.0),\n            model_input_messages=[],\n            model_output_message=ChatMessage(role=\"assistant\", content=\"Test plan\"),\n            plan=\"Test planning step\",\n        )\n\n        # Test with ActionStep\n        agent._finalize_step(action_step)\n\n        # Verify all callbacks were called\n        callback1.assert_called_once_with(action_step, agent=agent)\n        callback2.assert_called_once_with(action_step, agent=agent)\n\n        # Reset mocks\n        callback1.reset_mock()\n        callback2.reset_mock()\n\n        # Test with PlanningStep\n        agent._finalize_step(planning_step)\n\n        # Verify no callbacks were called: a list of callbacks is registered only for ActionStep\n        callback1.assert_not_called()\n        callback2.assert_not_called()\n\n    def test_finalize_step_callbacks_by_type(self):\n        # Create mock callbacks for different step types\n        action_step_callback = MagicMock()\n        action_step_callback_2 = MagicMock()\n        planning_step_callback = MagicMock()\n        step_callback = MagicMock()\n        final_answer_step_callback = MagicMock()\n\n        # Register callbacks for different step types\n        step_callbacks = {\n            ActionStep: [action_step_callback, action_step_callback_2],\n            PlanningStep: planning_step_callback,\n            MemoryStep: step_callback,\n            FinalAnswerStep: final_answer_step_callback,\n        }\n        agent = DummyMultiStepAgent(tools=[], model=MagicMock(), step_callbacks=step_callbacks)\n\n        # Create steps of different types\n        action_step = ActionStep(step_number=1, timing=Timing(start_time=0.0))\n        planning_step = PlanningStep(\n            
timing=Timing(start_time=1.0),\n            model_input_messages=[],\n            model_output_message=ChatMessage(role=\"assistant\", content=\"Test plan\"),\n            plan=\"Test planning step\",\n        )\n        final_answer_step = FinalAnswerStep(output=\"Sample output\")\n\n        # Test with ActionStep\n        agent._finalize_step(action_step)\n\n        # Verify correct callbacks were called\n        action_step_callback.assert_called_once_with(action_step, agent=agent)\n        action_step_callback_2.assert_called_once_with(action_step, agent=agent)\n        step_callback.assert_called_once_with(action_step, agent=agent)\n        planning_step_callback.assert_not_called()\n        final_answer_step_callback.assert_not_called()\n\n        # Reset mocks\n        action_step_callback.reset_mock()\n        action_step_callback_2.reset_mock()\n        planning_step_callback.reset_mock()\n        step_callback.reset_mock()\n        final_answer_step_callback.reset_mock()\n\n        # Test with PlanningStep\n        agent._finalize_step(planning_step)\n\n        # Verify correct callbacks were called\n        planning_step_callback.assert_called_once_with(planning_step, agent=agent)\n        step_callback.assert_called_once_with(planning_step, agent=agent)\n        action_step_callback.assert_not_called()\n        action_step_callback_2.assert_not_called()\n        final_answer_step_callback.assert_not_called()\n\n        # Reset mocks\n        action_step_callback.reset_mock()\n        action_step_callback_2.reset_mock()\n        planning_step_callback.reset_mock()\n        step_callback.reset_mock()\n        final_answer_step_callback.reset_mock()\n\n        # Test with FinalAnswerStep\n        agent._finalize_step(final_answer_step)\n\n        # Verify correct callbacks were called\n        planning_step_callback.assert_not_called()\n        step_callback.assert_called_once_with(final_answer_step, agent=agent)\n        
action_step_callback.assert_not_called()\n        action_step_callback_2.assert_not_called()\n        final_answer_step_callback.assert_called_once_with(final_answer_step, agent=agent)\n\n    def test_logs_display_thoughts_even_if_error(self):\n        class FakeJsonModelNoCall(Model):\n            def generate(self, messages, stop_sequences=None, tools_to_call_from=None):\n                return ChatMessage(\n                    role=MessageRole.ASSISTANT,\n                    content=\"\"\"I don't want to call tools today\"\"\",\n                    tool_calls=None,\n                    raw=\"\"\"I don't want to call tools today\"\"\",\n                )\n\n        agent_toolcalling = ToolCallingAgent(model=FakeJsonModelNoCall(), tools=[], max_steps=1, verbosity_level=10)\n        with agent_toolcalling.logger.console.capture() as capture:\n            agent_toolcalling.run(\"Dummy task\")\n        assert \"don't\" in capture.get() and \"want\" in capture.get()\n\n        class FakeCodeModelNoCall(Model):\n            def generate(self, messages, stop_sequences=None):\n                return ChatMessage(\n                    role=MessageRole.ASSISTANT,\n                    content=\"\"\"I don't want to write an action today\"\"\",\n                )\n\n        agent_code = CodeAgent(model=FakeCodeModelNoCall(), tools=[], max_steps=1, verbosity_level=10)\n        with agent_code.logger.console.capture() as capture:\n            agent_code.run(\"Dummy task\")\n        assert \"don't\" in capture.get() and \"want\" in capture.get()\n\n    def test_step_number(self):\n        fake_model = MagicMock()\n        fake_model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"Model output.\",\n            tool_calls=None,\n            raw=\"Model output.\",\n            token_usage=None,\n        )\n        max_steps = 2\n        agent = CodeAgent(tools=[], model=fake_model, max_steps=max_steps)\n        assert 
hasattr(agent, \"step_number\"), \"step_number attribute should be defined\"\n        assert agent.step_number == 0, \"step_number should be initialized to 0\"\n        agent.run(\"Test task\")\n        assert hasattr(agent, \"step_number\"), \"step_number attribute should be defined\"\n        assert agent.step_number == max_steps + 1, \"step_number should be max_steps + 1 after run method is called\"\n\n    @pytest.mark.parametrize(\n        \"step, expected_messages_list\",\n        [\n            (\n                1,\n                [\n                    [\n                        ChatMessage(\n                            role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"INITIAL_PLAN_USER_PROMPT\"}]\n                        ),\n                    ],\n                ],\n            ),\n            (\n                2,\n                [\n                    [\n                        ChatMessage(\n                            role=MessageRole.SYSTEM,\n                            content=[{\"type\": \"text\", \"text\": \"UPDATE_PLAN_SYSTEM_PROMPT\"}],\n                        ),\n                        ChatMessage(\n                            role=MessageRole.USER,\n                            content=[{\"type\": \"text\", \"text\": \"UPDATE_PLAN_USER_PROMPT\"}],\n                        ),\n                    ],\n                ],\n            ),\n        ],\n    )\n    def test_planning_step(self, step, expected_messages_list):\n        fake_model = MagicMock()\n        agent = CodeAgent(\n            tools=[],\n            model=fake_model,\n        )\n        task = \"Test task\"\n\n        planning_step = list(agent._generate_planning_step(task, is_first_step=(step == 1), step=step))[-1]\n        expected_message_texts = {\n            \"INITIAL_PLAN_USER_PROMPT\": populate_template(\n                agent.prompt_templates[\"planning\"][\"initial_plan\"],\n                variables=dict(\n                    task=task,\n               
     tools=agent.tools,\n                    managed_agents=agent.managed_agents,\n                    answer_facts=planning_step.model_output_message.content,\n                ),\n            ),\n            \"UPDATE_PLAN_SYSTEM_PROMPT\": populate_template(\n                agent.prompt_templates[\"planning\"][\"update_plan_pre_messages\"], variables=dict(task=task)\n            ),\n            \"UPDATE_PLAN_USER_PROMPT\": populate_template(\n                agent.prompt_templates[\"planning\"][\"update_plan_post_messages\"],\n                variables=dict(\n                    task=task,\n                    tools=agent.tools,\n                    managed_agents=agent.managed_agents,\n                    facts_update=planning_step.model_output_message.content,\n                    remaining_steps=agent.max_steps - step,\n                ),\n            ),\n        }\n        for expected_messages in expected_messages_list:\n            for expected_message in expected_messages:\n                expected_message.content[0][\"text\"] = expected_message_texts[expected_message.content[0][\"text\"]]\n        assert isinstance(planning_step, PlanningStep)\n        expected_model_input_messages = expected_messages_list[0]\n        model_input_messages = planning_step.model_input_messages\n        assert isinstance(model_input_messages, list)\n        assert len(model_input_messages) == len(expected_model_input_messages)  # 2\n        for message, expected_message in zip(model_input_messages, expected_model_input_messages):\n            assert isinstance(message, ChatMessage)\n            assert message.role in MessageRole.__members__.values()\n            assert message.role == expected_message.role\n            assert isinstance(message.content, list)\n            for content, expected_content in zip(message.content, expected_message.content):\n                assert content == expected_content\n        # Test calls to model\n        assert 
len(fake_model.generate.call_args_list) == 1\n        for call_args, expected_messages in zip(fake_model.generate.call_args_list, expected_messages_list):\n            assert len(call_args.args) == 1\n            messages = call_args.args[0]\n            assert isinstance(messages, list)\n            assert len(messages) == len(expected_messages)\n            for message, expected_message in zip(messages, expected_messages):\n                assert isinstance(message, ChatMessage)\n                assert message.role in MessageRole.__members__.values()\n                assert message.role == expected_message.role\n                assert isinstance(message.content, list)\n                for content, expected_content in zip(message.content, expected_message.content):\n                    assert content == expected_content\n\n    @pytest.mark.parametrize(\n        \"expected_messages_list\",\n        [\n            [\n                [\n                    ChatMessage(\n                        role=MessageRole.SYSTEM,\n                        content=[{\"type\": \"text\", \"text\": \"FINAL_ANSWER_SYSTEM_PROMPT\"}],\n                    ),\n                    ChatMessage(\n                        role=MessageRole.USER,\n                        content=[{\"type\": \"text\", \"text\": \"FINAL_ANSWER_USER_PROMPT\"}],\n                    ),\n                ]\n            ],\n            [\n                [\n                    ChatMessage(\n                        role=MessageRole.SYSTEM,\n                        content=[\n                            {\"type\": \"text\", \"text\": \"FINAL_ANSWER_SYSTEM_PROMPT\"},\n                            {\"type\": \"image\", \"image\": \"image1.png\"},\n                        ],\n                    ),\n                    ChatMessage(\n                        role=MessageRole.USER,\n                        content=[{\"type\": \"text\", \"text\": \"FINAL_ANSWER_USER_PROMPT\"}],\n                    ),\n                ]\n       
     ],\n        ],\n    )\n    def test_provide_final_answer(self, expected_messages_list):\n        fake_model = MagicMock()\n        fake_model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"Final answer.\",\n            tool_calls=None,\n            raw=\"Final answer.\",\n            token_usage=None,\n        )\n        agent = CodeAgent(\n            tools=[],\n            model=fake_model,\n        )\n        task = \"Test task\"\n        final_answer = agent.provide_final_answer(task).content\n        expected_message_texts = {\n            \"FINAL_ANSWER_SYSTEM_PROMPT\": agent.prompt_templates[\"final_answer\"][\"pre_messages\"],\n            \"FINAL_ANSWER_USER_PROMPT\": populate_template(\n                agent.prompt_templates[\"final_answer\"][\"post_messages\"], variables=dict(task=task)\n            ),\n        }\n        for expected_messages in expected_messages_list:\n            for expected_message in expected_messages:\n                for expected_content in expected_message.content:\n                    if \"text\" in expected_content:\n                        expected_content[\"text\"] = expected_message_texts[expected_content[\"text\"]]\n        assert final_answer == \"Final answer.\"\n        # Test calls to model\n        assert len(fake_model.generate.call_args_list) == 1\n        for call_args, expected_messages in zip(fake_model.generate.call_args_list, expected_messages_list):\n            assert len(call_args.args) == 1\n            messages = call_args.args[0]\n            assert isinstance(messages, list)\n            assert len(messages) == len(expected_messages)\n            for message, expected_message in zip(messages, expected_messages):\n                assert isinstance(message, ChatMessage)\n                assert message.role in MessageRole.__members__.values()\n                assert message.role == expected_message.role\n                assert 
isinstance(message.content, list)\n                for content, expected_content in zip(message.content, expected_message.content):\n                    assert content == expected_content\n\n    def test_interrupt(self):\n        fake_model = MagicMock()\n        fake_model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"Model output.\",\n            tool_calls=None,\n            raw=\"Model output.\",\n            token_usage=None,\n        )\n\n        def interrupt_callback(memory_step, agent):\n            agent.interrupt()\n\n        agent = CodeAgent(\n            tools=[],\n            model=fake_model,\n            step_callbacks=[interrupt_callback],\n        )\n        with pytest.raises(AgentError) as e:\n            agent.run(\"Test task\")\n        assert \"Agent interrupted\" in str(e)\n\n    @pytest.mark.parametrize(\n        \"tools, managed_agents, name, expectation\",\n        [\n            # Valid case: no duplicates\n            (\n                [MockTool(\"tool1\"), MockTool(\"tool2\")],\n                [MockAgent(\"agent1\", [MockTool(\"tool3\")])],\n                \"test_agent\",\n                does_not_raise(),\n            ),\n            # Invalid case: duplicate tool names\n            ([MockTool(\"tool1\"), MockTool(\"tool1\")], [], \"test_agent\", pytest.raises(ValueError)),\n            # Invalid case: tool name same as managed agent name\n            (\n                [MockTool(\"tool1\")],\n                [MockAgent(\"tool1\", [MockTool(\"final_answer\")])],\n                \"test_agent\",\n                pytest.raises(ValueError),\n            ),\n            # Valid case: tool name same as managed agent's tool name\n            ([MockTool(\"tool1\")], [MockAgent(\"agent1\", [MockTool(\"tool1\")])], \"test_agent\", does_not_raise()),\n            # Invalid case: agent name same as one of its own tool names\n            ([MockTool(\"tool1\")], [], \"tool1\", 
pytest.raises(ValueError)),\n            # Valid case: duplicate tool names across managed agents\n            (\n                [MockTool(\"tool1\")],\n                [\n                    MockAgent(\"agent1\", [MockTool(\"tool2\"), MockTool(\"final_answer\")]),\n                    MockAgent(\"agent2\", [MockTool(\"tool2\"), MockTool(\"final_answer\")]),\n                ],\n                \"test_agent\",\n                does_not_raise(),\n            ),\n        ],\n    )\n    def test_validate_tools_and_managed_agents(self, tools, managed_agents, name, expectation):\n        fake_model = MagicMock()\n        with expectation:\n            DummyMultiStepAgent(\n                tools=tools,\n                model=fake_model,\n                name=name,\n                managed_agents=managed_agents,\n            )\n\n    def test_from_dict(self):\n        # Create a test agent dictionary\n        agent_dict = {\n            \"model\": {\"class\": \"TransformersModel\", \"data\": {\"model_id\": \"test/model\"}},\n            \"tools\": [\n                {\n                    \"name\": \"valid_tool_function\",\n                    \"code\": 'from smolagents import Tool\\nfrom typing import Any, Optional\\n\\nclass SimpleTool(Tool):\\n    name = \"valid_tool_function\"\\n    description = \"A valid tool function.\"\\n    inputs = {\"input\":{\"type\":\"string\",\"description\":\"Input string.\"}}\\n    output_type = \"string\"\\n\\n    def forward(self, input: str) -> str:\\n        \"\"\"A valid tool function.\\n\\n        Args:\\n            input (str): Input string.\\n        \"\"\"\\n        return input.upper()',\n                    \"requirements\": {\"smolagents\"},\n                }\n            ],\n            \"managed_agents\": {},\n            \"prompt_templates\": EMPTY_PROMPT_TEMPLATES,\n            \"max_steps\": 15,\n            \"verbosity_level\": 2,\n            \"planning_interval\": 3,\n            \"name\": \"test_agent\",\n           
 \"description\": \"Test agent description\",\n        }\n\n        # Call from_dict: mock the MODEL_REGISTRY to return a mock model class\n        mock_model_class = MagicMock()\n        mock_model_instance = MagicMock()\n        mock_model_class.from_dict.return_value = mock_model_instance\n\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"TransformersModel\": mock_model_class}):\n            agent = DummyMultiStepAgent.from_dict(agent_dict)\n\n        # Verify the agent was created correctly\n        assert agent.model == mock_model_instance\n        assert mock_model_class.from_dict.call_args.args[0] == {\"model_id\": \"test/model\"}\n        assert agent.max_steps == 15\n        assert agent.logger.level == 2\n        assert agent.planning_interval == 3\n        assert agent.name == \"test_agent\"\n        assert agent.description == \"Test agent description\"\n        # Verify the tool was created correctly\n        assert sorted(agent.tools.keys()) == [\"final_answer\", \"valid_tool_function\"]\n        assert agent.tools[\"valid_tool_function\"].name == \"valid_tool_function\"\n        assert agent.tools[\"valid_tool_function\"].description == \"A valid tool function.\"\n        assert agent.tools[\"valid_tool_function\"].inputs == {\n            \"input\": {\"type\": \"string\", \"description\": \"Input string.\"}\n        }\n        assert agent.tools[\"valid_tool_function\"](\"test\") == \"TEST\"\n\n        # Test overriding with kwargs\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"TransformersModel\": mock_model_class}):\n            agent = DummyMultiStepAgent.from_dict(agent_dict, max_steps=30)\n        assert agent.max_steps == 30\n\n    def test_multiagent_to_dict_from_dict_roundtrip(self):\n        \"\"\"Test that to_dict() and from_dict() work correctly for agents with managed agents.\"\"\"\n        # Create a managed agent\n        managed_agent = CodeAgent(\n            tools=[], model=MagicMock(), 
name=\"managed_agent\", description=\"A managed agent for testing\", max_steps=5\n        )\n\n        # Create a main agent with the managed agent\n        main_agent = ToolCallingAgent(\n            tools=[],\n            managed_agents=[managed_agent],\n            model=MagicMock(),\n            name=\"main_agent\",\n            description=\"Main agent with managed agents\",\n            max_steps=10,\n        )\n\n        # Convert to dict\n        agent_dict = main_agent.to_dict()\n\n        # Verify managed_agents structure in dict\n        assert \"managed_agents\" in agent_dict\n        assert isinstance(agent_dict[\"managed_agents\"], list)\n        assert len(agent_dict[\"managed_agents\"]) == 1\n\n        managed_agent_dict = agent_dict[\"managed_agents\"][0]\n        assert managed_agent_dict[\"name\"] == \"managed_agent\"\n        assert managed_agent_dict[\"class\"] == \"CodeAgent\"\n        assert managed_agent_dict[\"description\"] == \"A managed agent for testing\"\n        assert managed_agent_dict[\"max_steps\"] == 5\n\n        # Test round-trip: from_dict should recreate the agent\n        # Mock the model class for the test\n        mock_model_class = MagicMock()\n        mock_model_instance = MagicMock()\n        mock_model_class.from_dict.return_value = mock_model_instance\n\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"MagicMock\": mock_model_class}):\n            recreated_agent = ToolCallingAgent.from_dict(agent_dict)\n\n        # Verify the recreated agent has the same structure\n        assert recreated_agent.name == \"main_agent\"\n        assert recreated_agent.description == \"Main agent with managed agents\"\n        assert recreated_agent.max_steps == 10\n        assert len(recreated_agent.managed_agents) == 1\n\n        recreated_managed_agent = list(recreated_agent.managed_agents.values())[0]\n        assert recreated_managed_agent.name == \"managed_agent\"\n        assert recreated_managed_agent.description 
== \"A managed agent for testing\"\n        assert recreated_managed_agent.max_steps == 5\n\n    def test_from_dict_invalid_model_class(self):\n        \"\"\"Test that from_dict raises ValueError with helpful message for invalid model class.\"\"\"\n        agent_dict = {\n            \"class\": \"CodeAgent\",\n            \"model\": {\"class\": \"InvalidModelClass\", \"data\": {}},\n            \"tools\": [],\n            \"managed_agents\": [],\n        }\n\n        with pytest.raises(ValueError) as exc_info:\n            CodeAgent.from_dict(agent_dict)\n\n        error_message = str(exc_info.value)\n        assert \"InvalidModelClass\" in error_message\n        assert \"Unknown model class\" in error_message\n        assert \"Supported models:\" in error_message\n\n    def test_from_dict_invalid_agent_class(self):\n        \"\"\"Test that from_dict raises ValueError with helpful message for invalid agent class.\"\"\"\n        # Create a valid agent first\n        agent = CodeAgent(tools=[], model=MagicMock(), name=\"test_agent\")\n        agent_dict = agent.to_dict()\n\n        # Add a managed agent with invalid class\n        agent_dict[\"managed_agents\"] = [\n            {\n                \"class\": \"InvalidAgentClass\",\n                \"model\": {\"class\": \"MagicMock\", \"data\": {}},\n                \"tools\": [],\n                \"managed_agents\": [],\n            }\n        ]\n\n        # Mock the model registry to allow the main agent's model and managed agent's model\n        mock_model_class = MagicMock()\n        mock_model_instance = MagicMock()\n        mock_model_class.from_dict.return_value = mock_model_instance\n\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"MagicMock\": mock_model_class}):\n            with pytest.raises(ValueError) as exc_info:\n                CodeAgent.from_dict(agent_dict)\n\n            error_message = str(exc_info.value)\n            assert \"InvalidAgentClass\" in error_message\n            
assert \"Unknown agent class\" in error_message\n            assert \"Supported agents:\" in error_message\n\n\nclass TestToolCallingAgent:\n    def test_toolcalling_agent_instructions(self):\n        agent = ToolCallingAgent(tools=[], model=MagicMock(), instructions=\"Test instructions\")\n        assert agent.instructions == \"Test instructions\"\n        assert \"Test instructions\" in agent.system_prompt\n\n    def test_toolcalling_agent_passes_both_tools_and_managed_agents(self, test_tool):\n        \"\"\"Test that both tools and managed agents are passed to the model.\"\"\"\n        managed_agent = MagicMock()\n        managed_agent.name = \"managed_agent\"\n        model = MagicMock()\n        model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"\",\n            tool_calls=[\n                ChatMessageToolCall(\n                    id=\"call_0\",\n                    type=\"function\",\n                    function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"input\": \"test_value\"}),\n                )\n            ],\n        )\n        agent = ToolCallingAgent(tools=[test_tool], managed_agents=[managed_agent], model=model)\n        # Run the agent one step to trigger the model call\n        next(agent.run(\"Test task\", stream=True))\n        # Check that the model was called with both tools and managed agents:\n        # - Get all tool_to_call_from names passed to the model\n        tools_to_call_from_names = [tool.name for tool in model.generate.call_args.kwargs[\"tools_to_call_from\"]]\n        # - Verify both regular tools and managed agents are included\n        assert \"test_tool\" in tools_to_call_from_names  # The regular tool\n        assert \"managed_agent\" in tools_to_call_from_names  # The managed agent\n        assert \"final_answer\" in tools_to_call_from_names  # The final_answer tool (added by default)\n\n    @patch(\"huggingface_hub.InferenceClient\")\n    def 
test_toolcalling_agent_api(self, mock_inference_client):\n        mock_client = mock_inference_client.return_value\n        mock_response = mock_client.chat_completion.return_value\n        mock_response.choices[0].message = ChatCompletionOutputMessage(\n            role=MessageRole.ASSISTANT,\n            content='{\"name\": \"weather_api\", \"arguments\": {\"location\": \"Paris\", \"date\": \"today\"}}',\n        )\n        mock_response.usage.prompt_tokens = 10\n        mock_response.usage.completion_tokens = 20\n\n        model = InferenceClientModel(model_id=\"test-model\")\n\n        from smolagents import tool\n\n        @tool\n        def weather_api(location: str, date: str) -> str:\n            \"\"\"\n            Gets the weather in the next days at given location.\n            Args:\n                location: the location\n                date: the date\n            \"\"\"\n            return f\"The weather in {location} on date:{date} is sunny.\"\n\n        agent = ToolCallingAgent(model=model, tools=[weather_api], max_steps=1)\n        agent.run(\"What's the weather in Paris?\")\n        assert agent.memory.steps[0].task == \"What's the weather in Paris?\"\n        assert agent.memory.steps[1].tool_calls[0].name == \"weather_api\"\n        assert agent.memory.steps[1].tool_calls[0].arguments == {\"location\": \"Paris\", \"date\": \"today\"}\n        assert agent.memory.steps[1].observations == \"The weather in Paris on date:today is sunny.\"\n\n        mock_response.choices[0].message = ChatCompletionOutputMessage(\n            role=MessageRole.ASSISTANT,\n            content=None,\n            tool_calls=[\n                ChatCompletionOutputToolCall(\n                    function=ChatCompletionOutputFunctionDefinition(\n                        name=\"weather_api\", arguments='{\"location\": \"Paris\", \"date\": \"today\"}'\n                    ),\n                    id=\"call_0\",\n                    type=\"function\",\n                )\n        
    ],\n        )\n\n        agent.run(\"What's the weather in Paris?\")\n        assert agent.memory.steps[0].task == \"What's the weather in Paris?\"\n        assert agent.memory.steps[1].tool_calls[0].name == \"weather_api\"\n        assert agent.memory.steps[1].tool_calls[0].arguments == {\"location\": \"Paris\", \"date\": \"today\"}\n        assert agent.memory.steps[1].observations == \"The weather in Paris on date:today is sunny.\"\n\n    @patch(\"openai.OpenAI\")\n    def test_toolcalling_agent_stream_logs_multiple_tool_calls_observations(self, mock_openai_client, test_tool):\n        \"\"\"Test that ToolCallingAgent with stream_outputs=True logs the observations of all tool calls when multiple are called.\"\"\"\n        mock_client = mock_openai_client.return_value\n        from smolagents import OpenAIModel\n\n        # Mock streaming response with multiple tool calls\n        mock_deltas = [\n            ChoiceDelta(role=MessageRole.ASSISTANT),\n            ChoiceDelta(\n                tool_calls=[\n                    ChoiceDeltaToolCall(\n                        index=0,\n                        id=\"call_1\",\n                        function=ChoiceDeltaToolCallFunction(name=\"test_tool\"),\n                        type=\"function\",\n                    )\n                ]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=0, function=ChoiceDeltaToolCallFunction(arguments='{\"in'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=0, function=ChoiceDeltaToolCallFunction(arguments='put\"'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=0, function=ChoiceDeltaToolCallFunction(arguments=': \"out'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=0, function=ChoiceDeltaToolCallFunction(arguments=\"put1\"))]\n            ),\n            ChoiceDelta(\n           
     tool_calls=[ChoiceDeltaToolCall(index=0, function=ChoiceDeltaToolCallFunction(arguments='\"}'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[\n                    ChoiceDeltaToolCall(\n                        index=1,\n                        id=\"call_2\",\n                        function=ChoiceDeltaToolCallFunction(name=\"test_tool\"),\n                        type=\"function\",\n                    )\n                ]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=1, function=ChoiceDeltaToolCallFunction(arguments='{\"in'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=1, function=ChoiceDeltaToolCallFunction(arguments='put\"'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=1, function=ChoiceDeltaToolCallFunction(arguments=': \"out'))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=1, function=ChoiceDeltaToolCallFunction(arguments=\"put2\"))]\n            ),\n            ChoiceDelta(\n                tool_calls=[ChoiceDeltaToolCall(index=1, function=ChoiceDeltaToolCallFunction(arguments='\"}'))]\n            ),\n        ]\n\n        class MockChoice:\n            def __init__(self, delta):\n                self.delta = delta\n\n        class MockChunk:\n            def __init__(self, delta):\n                self.choices = [MockChoice(delta)]\n                self.usage = None\n\n        mock_client.chat.completions.create.return_value = (MockChunk(delta) for delta in mock_deltas)\n\n        # Mock usage for non-streaming fallback\n        mock_usage = MagicMock()\n        mock_usage.prompt_tokens = 10\n        mock_usage.completion_tokens = 20\n\n        model = OpenAIModel(model_id=\"fakemodel\")\n\n        agent = ToolCallingAgent(model=model, tools=[test_tool], max_steps=1, stream_outputs=True)\n        agent.run(\"Dummy 
task\")\n        assert agent.memory.steps[1].model_output_message.tool_calls[0].function.name == \"test_tool\"\n        assert agent.memory.steps[1].model_output_message.tool_calls[1].function.name == \"test_tool\"\n        assert agent.memory.steps[1].observations == \"Processed: output1\\nProcessed: output2\"\n\n    @patch(\"openai.OpenAI\")\n    def test_toolcalling_agent_final_answer_cannot_be_called_with_parallel_tool_calls(\n        self, mock_openai_client, test_tool\n    ):\n        \"\"\"Test that ToolCallingAgent records an error when final_answer is called in parallel with other tool calls.\"\"\"\n        mock_client = mock_openai_client.return_value\n\n        from smolagents import OpenAIModel\n\n        class ExtendedChatMessage(ChatMessage):\n            def __init__(self, *args, usage, **kwargs):\n                super().__init__(*args, **kwargs)\n\n            def model_dump(self, include=None):\n                return super().model_dump_json()\n\n        class MockChoice:\n            def __init__(self, chat_message):\n                self.message = chat_message\n\n        class MockChatCompletion:\n            def __init__(self, chat_message):\n                self.choices = [MockChoice(chat_message)]\n                self.usage = MockTokenUsage(prompt_tokens=10, completion_tokens=20)\n\n        class MockTokenUsage:\n            def __init__(self, prompt_tokens, completion_tokens):\n                self.prompt_tokens = prompt_tokens\n                self.completion_tokens = completion_tokens\n\n        from dataclasses import asdict\n\n        class ExtendedChatCompletionOutputMessage(ChatCompletionOutputMessage):\n            def __init__(self, *args, usage, **kwargs):\n                super().__init__(*args, **kwargs)\n                self.usage = usage\n\n            def model_dump(self, include=None):\n                print(\"TOOL CALLS\", self.tool_calls)\n                return {\n                    \"role\": self.role,\n                   
 \"content\": self.content,\n                    \"tool_calls\": [asdict(tc) for tc in self.tool_calls],\n                }\n\n        mock_client.chat.completions.create.return_value = MockChatCompletion(\n            ExtendedChatCompletionOutputMessage(\n                role=MessageRole.ASSISTANT,\n                content=None,\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"call_0\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"input\": \"out1\"}),\n                    ),\n                    ChatMessageToolCall(\n                        id=\"1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"final_answer\", arguments={\"answer\": \"out1\"}),\n                    ),\n                ],\n                usage=MockTokenUsage(prompt_tokens=10, completion_tokens=20),\n            )\n        )\n\n        model = OpenAIModel(model_id=\"fakemodel\")\n\n        agent = ToolCallingAgent(model=model, tools=[test_tool], max_steps=1)\n        agent.run(\"Dummy task\")\n        assert agent.memory.steps[1].error is not None\n        assert (\n            \"do not perform any other tool calls than the final answer tool call!\"\n            in agent.memory.steps[1].error.message\n        )\n\n    @patch(\"huggingface_hub.InferenceClient\")\n    def test_toolcalling_agent_api_misformatted_output(self, mock_inference_client):\n        \"\"\"Test that even misformatted json blobs don't interrupt the run for a ToolCallingAgent.\"\"\"\n        mock_client = mock_inference_client.return_value\n        mock_response = mock_client.chat_completion.return_value\n        mock_response.choices[0].message = ChatCompletionOutputMessage(\n            role=MessageRole.ASSISTANT,\n            content='{\"name\": weather_api\", \"arguments\": {\"location\": \"Paris\", \"date\": 
\"today\"}}',\n        )\n\n        mock_response.usage.prompt_tokens = 10\n        mock_response.usage.completion_tokens = 20\n\n        model = InferenceClientModel(model_id=\"test-model\")\n\n        logger = AgentLogger(console=Console(markup=False, no_color=True))\n\n        agent = ToolCallingAgent(model=model, tools=[], max_steps=2, verbosity_level=1, logger=logger)\n        with agent.logger.console.capture() as capture:\n            agent.run(\"What's the weather in Paris?\")\n        assert agent.memory.steps[0].task == \"What's the weather in Paris?\"\n        assert agent.memory.steps[1].tool_calls is None\n        assert \"The JSON blob you used is invalid\" in agent.memory.steps[1].error.message\n        assert \"Error while parsing\" in capture.get()\n        assert len(agent.memory.steps) == 4\n\n    @pytest.mark.skip(\n        reason=\"Test is not properly implemented (GH-1255) because fake_tools should have the same name. \"\n        \"Additionally, it uses CodeAgent instead of ToolCallingAgent (GH-1409)\"\n    )\n    def test_change_tools_after_init(self):\n        from smolagents import tool\n\n        @tool\n        def fake_tool_1() -> str:\n            \"\"\"Fake tool\"\"\"\n            return \"1\"\n\n        @tool\n        def fake_tool_2() -> str:\n            \"\"\"Fake tool\"\"\"\n            return \"2\"\n\n        class FakeCodeModel(Model):\n            def generate(self, messages, stop_sequences=None):\n                return ChatMessage(role=MessageRole.ASSISTANT, content=\"<code>\\nfinal_answer(fake_tool_1())\\n</code>\")\n\n        agent = CodeAgent(tools=[fake_tool_1], model=FakeCodeModel())\n\n        agent.tools[\"final_answer\"] = CustomFinalAnswerTool()\n        agent.tools[\"fake_tool_1\"] = fake_tool_2\n\n        answer = agent.run(\"Fake task.\")\n        assert answer == \"2CUSTOM\"\n\n    def test_custom_final_answer_with_custom_inputs(self, test_tool):\n        class 
CustomFinalAnswerToolWithCustomInputs(FinalAnswerTool):\n            inputs = {\n                \"answer1\": {\"type\": \"string\", \"description\": \"First part of the answer.\"},\n                \"answer2\": {\"type\": \"string\", \"description\": \"Second part of the answer.\"},\n            }\n\n            def forward(self, answer1: str, answer2: str) -> str:\n                return answer1 + \" and \" + answer2\n\n        model = MagicMock()\n        model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=None,\n            tool_calls=[\n                ChatMessageToolCall(\n                    id=\"call_0\",\n                    type=\"function\",\n                    function=ChatMessageToolCallFunction(\n                        name=\"final_answer\", arguments={\"answer1\": \"1\", \"answer2\": \"2\"}\n                    ),\n                ),\n            ],\n        )\n        agent = ToolCallingAgent(tools=[test_tool, CustomFinalAnswerToolWithCustomInputs()], model=model)\n        answer = agent.run(\"Fake task.\")\n        assert answer == \"1 and 2\"\n        assert agent.memory.steps[-1].model_output_message.tool_calls[0].function.name == \"final_answer\"\n\n    @pytest.mark.parametrize(\n        \"test_case\",\n        [\n            # Case 0: Single valid tool call\n            {\n                \"tool_calls\": [\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"input\": \"test_value\"}),\n                    )\n                ],\n                \"expected_observations\": \"Processed: test_value\",\n                \"expected_final_outputs\": [\"Processed: test_value\"],\n                \"expected_error\": None,\n            },\n            # Case 1: Multiple tool calls\n            {\n                \"tool_calls\": [\n       
             ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"input\": \"value1\"}),\n                    ),\n                    ChatMessageToolCall(\n                        id=\"call_2\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"input\": \"value2\"}),\n                    ),\n                ],\n                \"expected_observations\": \"Processed: value1\\nProcessed: value2\",\n                \"expected_final_outputs\": [\"Processed: value1\", \"Processed: value2\"],\n                \"expected_error\": None,\n            },\n            # Case 2: Invalid tool name\n            {\n                \"tool_calls\": [\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"nonexistent_tool\", arguments={\"input\": \"test\"}),\n                    )\n                ],\n                \"expected_error\": AgentToolExecutionError,\n            },\n            # Case 3: Tool execution error\n            {\n                \"tool_calls\": [\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"input\": \"error\"}),\n                    )\n                ],\n                \"expected_error\": AgentToolExecutionError,\n            },\n            # Case 4: Empty tool calls list\n            {\n                \"tool_calls\": [],\n                \"expected_observations\": \"\",\n                \"expected_final_outputs\": [],\n                \"expected_error\": None,\n            },\n            # Case 5: Final answer 
call\n            {\n                \"tool_calls\": [\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            name=\"final_answer\", arguments={\"answer\": \"This is the final answer\"}\n                        ),\n                    )\n                ],\n                \"expected_observations\": \"This is the final answer\",\n                \"expected_final_outputs\": [\"This is the final answer\"],\n                \"expected_error\": None,\n            },\n            # Case 6: Invalid arguments\n            {\n                \"tool_calls\": [\n                    ChatMessageToolCall(\n                        id=\"call_1\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(name=\"test_tool\", arguments={\"wrong_param\": \"value\"}),\n                    )\n                ],\n                \"expected_error\": AgentToolCallError,\n            },\n        ],\n    )\n    def test_process_tool_calls(self, test_case, test_tool):\n        # Create a ToolCallingAgent instance with the test tool\n        agent = ToolCallingAgent(tools=[test_tool], model=MagicMock())\n        # Create chat message with the specified tool calls for process_tool_calls\n        chat_message = ChatMessage(role=MessageRole.ASSISTANT, content=\"\", tool_calls=test_case[\"tool_calls\"])\n        # Create a memory step for process_tool_calls\n        memory_step = ActionStep(step_number=10, timing=\"mock_timing\", model_output=\"\")\n\n        # Process tool calls\n        if test_case[\"expected_error\"]:\n            with pytest.raises(test_case[\"expected_error\"]):\n                list(agent.process_tool_calls(chat_message, memory_step))\n        else:\n            final_outputs = list(agent.process_tool_calls(chat_message, memory_step))\n            assert 
memory_step.model_output == \"\"\n            assert memory_step.observations == test_case[\"expected_observations\"]\n            assert [\n                final_output.output for final_output in final_outputs if isinstance(final_output, ToolOutput)\n            ] == test_case[\"expected_final_outputs\"]\n            # Verify memory step tool calls were updated correctly\n            if test_case[\"tool_calls\"]:\n                assert memory_step.tool_calls == [\n                    ToolCall(name=tool_call.function.name, arguments=tool_call.function.arguments, id=tool_call.id)\n                    for tool_call in test_case[\"tool_calls\"]\n                ]\n\n\nclass TestCodeAgent:\n    def test_code_agent_instructions(self):\n        agent = CodeAgent(tools=[], model=MagicMock(), instructions=\"Test instructions\")\n        assert agent.instructions == \"Test instructions\"\n        assert \"Test instructions\" in agent.system_prompt\n\n        agent = CodeAgent(\n            tools=[], model=MagicMock(), instructions=\"Test instructions\", use_structured_outputs_internally=True\n        )\n        assert agent.instructions == \"Test instructions\"\n        assert \"Test instructions\" in agent.system_prompt\n\n    @pytest.mark.parametrize(\"provide_run_summary\", [False, True])\n    def test_call_with_provide_run_summary(self, provide_run_summary):\n        agent = CodeAgent(tools=[], model=MagicMock(), provide_run_summary=provide_run_summary)\n        assert agent.provide_run_summary is provide_run_summary\n        agent.name = \"test_agent\"\n        agent.run = MagicMock(return_value=\"Test output\")\n        agent.write_memory_to_messages = MagicMock(\n            return_value=[ChatMessage(role=MessageRole.ASSISTANT, content=\"Test summary\")]\n        )\n\n        result = agent(\"Test request\")\n        expected_summary = \"Here is the final answer from your managed agent 'test_agent':\\nTest output\"\n        if provide_run_summary:\n            
expected_summary += (\n                \"\\n\\nFor more detail, find below a summary of this agent's work:\\n\"\n                \"<summary_of_work>\\n\\nTest summary\\n---\\n</summary_of_work>\"\n            )\n        assert result == expected_summary\n\n    def test_code_agent_image_output(self):\n        from PIL import Image\n\n        from smolagents import tool\n\n        @tool\n        def image_generation_tool():\n            \"\"\"Generate an image\"\"\"\n            return Image.new(\"RGB\", (100, 100), color=\"red\")\n\n        agent = CodeAgent(tools=[image_generation_tool], model=FakeCodeModelImageGeneration(), verbosity_level=1)\n        output = agent.run(\"Make me an image from the latest trend on google trends.\")\n        assert isinstance(output, Image.Image)\n\n    def test_errors_logging(self):\n        class FakeCodeModel(Model):\n            def generate(self, messages, stop_sequences=None):\n                return ChatMessage(role=MessageRole.ASSISTANT, content=\"<code>\\nsecret=3;['1', '2'][secret]\\n</code>\")\n\n        agent = CodeAgent(tools=[], model=FakeCodeModel(), verbosity_level=1)\n\n        with agent.logger.console.capture() as capture:\n            agent.run(\"Test request\")\n        # Verify [secret] is rendered literally, not interpreted as Rich markup (which would\n        # inject ANSI reset/re-apply codes like \\x1b[0m\\x1b[1;31m in place of the brackets).\n        assert \"[secret]\" in capture.get()\n\n    def test_missing_import_triggers_advice_in_error_log(self):\n        # Set explicit verbosity level to 1 to override the default verbosity level of -1 set in CI fixture\n        agent = CodeAgent(tools=[], model=FakeCodeModelImport(), verbosity_level=1)\n\n        with agent.logger.console.capture() as capture:\n            agent.run(\"Count to 3\")\n        str_output = capture.get()\n        assert \"`additional_authorized_imports`\" in str_output.replace(\"\\n\", \"\")\n\n    def 
test_errors_show_offending_line_and_error(self):\n        agent = CodeAgent(tools=[PythonInterpreterTool()], model=FakeCodeModelError())\n        output = agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert isinstance(output, AgentText)\n        assert output == \"got an error\"\n        assert \"Code execution failed at line 'error_function()'\" in str(agent.memory.steps[1].error)\n        assert \"ValueError\" in str(agent.memory.steps)\n\n    def test_error_saves_previous_print_outputs(self):\n        agent = CodeAgent(tools=[PythonInterpreterTool()], model=FakeCodeModelError())\n        agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert \"Flag!\" in str(agent.memory.steps[1].observations)\n\n    def test_syntax_error_show_offending_lines(self):\n        agent = CodeAgent(tools=[PythonInterpreterTool()], model=FakeCodeModelSyntaxError())\n        output = agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert isinstance(output, AgentText)\n        assert output == \"got an error\"\n        assert '    print(\"Failing due to unexpected indent\")' in str(agent.memory.steps)\n        assert isinstance(agent.memory.steps[-2], ActionStep)\n        assert agent.memory.steps[-2].code_action == dedent(\"\"\"a = 2\nb = a * 2\n    print(\"Failing due to unexpected indent\")\nprint(\"Ok, calculation done!\")\"\"\")\n\n    def test_end_code_appending(self):\n        # Checking original output message\n        orig_output = FakeCodeModelNoReturn().generate([])\n        assert not orig_output.content.endswith(\"<end_code>\")\n\n        # Checking the step output\n        agent = CodeAgent(\n            tools=[PythonInterpreterTool()],\n            model=FakeCodeModelNoReturn(),\n            max_steps=1,\n        )\n        answer = agent.run(\"What is 2 multiplied by 3.6452?\")\n        assert answer\n\n        memory_steps = agent.memory.steps\n        actions_steps = [s for s in memory_steps if isinstance(s, ActionStep)]\n\n        outputs = 
[s.model_output for s in actions_steps if s.model_output]\n        assert outputs\n        assert all(o.endswith(\"</code>\") for o in outputs)\n\n        messages = [s.model_output_message for s in actions_steps if s.model_output_message]\n        assert messages\n        assert all(m.content.endswith(\"</code>\") for m in messages)\n\n    @pytest.mark.skip(\n        reason=\"Test is not properly implemented (GH-1255) because fake_tools should have the same name. \"\n    )\n    def test_change_tools_after_init(self):\n        from smolagents import tool\n\n        @tool\n        def fake_tool_1() -> str:\n            \"\"\"Fake tool\"\"\"\n            return \"1\"\n\n        @tool\n        def fake_tool_2() -> str:\n            \"\"\"Fake tool\"\"\"\n            return \"2\"\n\n        class FakeCodeModel(Model):\n            def generate(self, messages, stop_sequences=None):\n                return ChatMessage(role=MessageRole.ASSISTANT, content=\"<code>\\nfinal_answer(fake_tool_1())\\n</code>\")\n\n        agent = CodeAgent(tools=[fake_tool_1], model=FakeCodeModel())\n\n        agent.tools[\"final_answer\"] = CustomFinalAnswerTool()\n        agent.tools[\"fake_tool_1\"] = fake_tool_2\n\n        answer = agent.run(\"Fake task.\")\n        assert answer == \"2CUSTOM\"\n\n    def test_local_python_executor_with_custom_functions(self):\n        model = MagicMock()\n        model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=\"\",\n            tool_calls=None,\n            raw=\"\",\n            token_usage=None,\n        )\n        agent = CodeAgent(tools=[], model=model, executor_kwargs={\"additional_functions\": {\"open\": open}})\n        agent.run(\"Test run\")\n        assert \"open\" in agent.python_executor.static_tools\n\n    @pytest.mark.parametrize(\"agent_dict_version\", [\"v1.9\", \"v1.10\", \"v1.20\"])\n    def test_from_folder(self, agent_dict_version, get_agent_dict):\n        agent_dict = 
get_agent_dict(agent_dict_version)\n        mock_model_class = MagicMock()\n        mock_model_instance = MagicMock()\n        mock_model_instance.model_id = \"Qwen/Qwen2.5-Coder-32B-Instruct\"\n        mock_model_class.from_dict.return_value = mock_model_instance\n\n        with (\n            patch(\"smolagents.agents.Path\") as mock_path,\n            patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"InferenceClientModel\": mock_model_class}),\n        ):\n            import json\n\n            mock_path.return_value.__truediv__.return_value.read_text.return_value = json.dumps(agent_dict)\n            agent = CodeAgent.from_folder(\"ignored_dummy_folder\")\n        assert isinstance(agent, CodeAgent)\n        assert agent.name == \"test_agent\"\n        assert agent.description == \"dummy description\"\n        assert agent.max_steps == 10\n        assert agent.planning_interval == 2\n        assert agent.additional_authorized_imports == [\"pandas\"]\n        assert \"pandas\" in agent.authorized_imports\n        assert agent.executor_type == \"local\"\n        assert agent.executor_kwargs == {}\n        assert agent.max_print_outputs_length is None\n        assert agent.managed_agents == {}\n        assert set(agent.tools.keys()) == {\"final_answer\"}\n        assert agent.model == mock_model_instance\n        assert mock_model_class.from_dict.call_args.args[0][\"model_id\"] == \"Qwen/Qwen2.5-Coder-32B-Instruct\"\n        assert agent.model.model_id == \"Qwen/Qwen2.5-Coder-32B-Instruct\"\n        assert agent.logger.level == 2\n        assert agent.prompt_templates[\"system_prompt\"] == \"dummy system prompt\"\n\n    def test_from_dict(self):\n        # Create a test agent dictionary\n        agent_dict = {\n            \"model\": {\"class\": \"InferenceClientModel\", \"data\": {\"model_id\": \"Qwen/Qwen2.5-Coder-32B-Instruct\"}},\n            \"tools\": [\n                {\n                    \"name\": \"valid_tool_function\",\n                    
\"code\": 'from smolagents import Tool\\nfrom typing import Any, Optional\\n\\nclass SimpleTool(Tool):\\n    name = \"valid_tool_function\"\\n    description = \"A valid tool function.\"\\n    inputs = {\"input\":{\"type\":\"string\",\"description\":\"Input string.\"}}\\n    output_type = \"string\"\\n\\n    def forward(self, input: str) -> str:\\n        \"\"\"A valid tool function.\\n\\n        Args:\\n            input (str): Input string.\\n        \"\"\"\\n        return input.upper()',\n                    \"requirements\": {\"smolagents\"},\n                }\n            ],\n            \"managed_agents\": {},\n            \"prompt_templates\": EMPTY_PROMPT_TEMPLATES,\n            \"max_steps\": 15,\n            \"verbosity_level\": 2,\n            \"use_structured_output\": False,\n            \"planning_interval\": 3,\n            \"name\": \"test_code_agent\",\n            \"description\": \"Test code agent description\",\n            \"authorized_imports\": [\"pandas\", \"numpy\"],\n            \"executor_type\": \"local\",\n            \"executor_kwargs\": {\"max_print_outputs_length\": 10_000},\n            \"max_print_outputs_length\": 1000,\n        }\n\n        # Call from_dict\n        mock_model_class = MagicMock()\n        mock_model_instance = MagicMock()\n        mock_model_class.from_dict.return_value = mock_model_instance\n\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"InferenceClientModel\": mock_model_class}):\n            agent = CodeAgent.from_dict(agent_dict)\n\n        # Verify the agent was created correctly with CodeAgent-specific parameters\n        assert agent.model == mock_model_instance\n        assert agent.additional_authorized_imports == [\"pandas\", \"numpy\"]\n        assert agent.executor_type == \"local\"\n        assert agent.executor_kwargs == {\"max_print_outputs_length\": 10_000}\n        assert agent.max_print_outputs_length == 1000\n\n        # Test with missing optional parameters\n        
minimal_agent_dict = {\n            \"model\": {\"class\": \"InferenceClientModel\", \"data\": {\"model_id\": \"Qwen/Qwen2.5-Coder-32B-Instruct\"}},\n            \"tools\": [],\n            \"managed_agents\": {},\n        }\n\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"InferenceClientModel\": mock_model_class}):\n            agent = CodeAgent.from_dict(minimal_agent_dict)\n        # Verify defaults are used\n        assert agent.max_steps == 20  # default from MultiStepAgent.__init__\n\n        # Test overriding with kwargs\n        with patch.dict(\"smolagents.models.MODEL_REGISTRY\", {\"InferenceClientModel\": mock_model_class}):\n            agent = CodeAgent.from_dict(\n                agent_dict,\n                additional_authorized_imports=[\"requests\"],\n                executor_kwargs={\"max_print_outputs_length\": 5_000},\n            )\n        assert agent.additional_authorized_imports == [\"requests\"]\n        assert agent.executor_kwargs == {\"max_print_outputs_length\": 5_000}\n\n    def test_custom_final_answer_with_custom_inputs(self):\n        class CustomFinalAnswerToolWithCustomInputs(FinalAnswerTool):\n            inputs = {\n                \"answer1\": {\"type\": \"string\", \"description\": \"First part of the answer.\"},\n                \"answer2\": {\"type\": \"string\", \"description\": \"Second part of the answer.\"},\n            }\n\n            def forward(self, answer1: str, answer2: str) -> str:\n                return answer1 + \"CUSTOM\" + answer2\n\n        model = MagicMock()\n        model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT, content=\"<code>\\nfinal_answer(answer1='1', answer2='2')\\n</code>\"\n        )\n        agent = CodeAgent(tools=[CustomFinalAnswerToolWithCustomInputs()], model=model)\n        answer = agent.run(\"Fake task.\")\n        assert answer == \"1CUSTOM2\"\n\n    def test_use_structured_outputs_internally(self):\n        expected_code = 
\"print('Hello, world!')\"\n        model = MagicMock()\n        # mock structured output generation\n        model.generate.return_value = ChatMessage(\n            role=MessageRole.ASSISTANT,\n            content=json.dumps({\"thought\": \"LLM-generated thought\", \"code\": expected_code}),\n        )\n        agent = CodeAgent(\n            tools=[], model=model, use_structured_outputs_internally=True\n        )  # Use structured outputs internally\n        tool_call: ToolCall = next(\n            agent._step_stream(ActionStep(step_number=1, timing=\"mock_timing\", model_output=\"\"))\n        )\n        assert tool_call.arguments == expected_code\n\n\nclass TestMultiAgents:\n    def test_multiagents_save(self, tmp_path):\n        model = InferenceClientModel(model_id=\"Qwen/Qwen2.5-Coder-32B-Instruct\", max_tokens=2096, temperature=0.5)\n\n        web_agent = ToolCallingAgent(\n            model=model,\n            tools=[DuckDuckGoSearchTool(max_results=2), VisitWebpageTool()],\n            name=\"web_agent\",\n            description=\"does web searches\",\n        )\n        code_agent = CodeAgent(model=model, tools=[], name=\"useless\", description=\"does nothing in particular\")\n\n        agent = CodeAgent(\n            model=model,\n            tools=[],\n            additional_authorized_imports=[\"pandas\", \"datetime\"],\n            managed_agents=[web_agent, code_agent],\n            max_print_outputs_length=1000,\n            executor_type=\"local\",\n            executor_kwargs={\"max_print_outputs_length\": 10_000},\n        )\n        agent.save(tmp_path)\n\n        expected_structure = {\n            \"managed_agents\": {\n                \"useless\": {\"tools\": {\"files\": [\"final_answer.py\"]}, \"files\": [\"agent.json\", \"prompts.yaml\"]},\n                \"web_agent\": {\n                    \"tools\": {\"files\": [\"final_answer.py\", \"visit_webpage.py\", \"web_search.py\"]},\n                    \"files\": [\"agent.json\", 
\"prompts.yaml\"],\n                },\n            },\n            \"tools\": {\"files\": [\"final_answer.py\"]},\n            \"files\": [\"app.py\", \"requirements.txt\", \"agent.json\", \"prompts.yaml\"],\n        }\n\n        def verify_structure(current_path: Path, structure: dict):\n            for dir_name, contents in structure.items():\n                if dir_name != \"files\":\n                    # For directories, verify they exist and recurse into them\n                    dir_path = current_path / dir_name\n                    assert dir_path.exists(), f\"Directory {dir_path} does not exist\"\n                    assert dir_path.is_dir(), f\"{dir_path} is not a directory\"\n                    verify_structure(dir_path, contents)\n                else:\n                    # For files, verify each exists in the current path\n                    for file_name in contents:\n                        file_path = current_path / file_name\n                        assert file_path.exists(), f\"File {file_path} does not exist\"\n                        assert file_path.is_file(), f\"{file_path} is not a file\"\n\n        verify_structure(tmp_path, expected_structure)\n\n        # Test that re-loaded agents work as expected.\n        agent2 = CodeAgent.from_folder(tmp_path, planning_interval=5)\n        assert agent2.planning_interval == 5  # Check that kwargs are used\n        assert set(agent2.authorized_imports) == set([\"pandas\", \"datetime\"] + BASE_BUILTIN_MODULES)\n        assert agent2.max_print_outputs_length == 1000\n        assert agent2.executor_type == \"local\"\n        assert agent2.executor_kwargs == {\"max_print_outputs_length\": 10_000}\n        assert (\n            agent2.managed_agents[\"web_agent\"].tools[\"web_search\"].max_results == 10\n        )  # For now tool init parameters are forgotten\n        assert agent2.model.kwargs[\"temperature\"] == pytest.approx(0.5)\n\n    def test_multiagents(self):\n        class 
FakeModelMultiagentsManagerAgent(Model):\n            model_id = \"fake_model\"\n\n            def generate(\n                self,\n                messages,\n                stop_sequences=None,\n                tools_to_call_from=None,\n            ):\n                if tools_to_call_from is not None:\n                    if len(messages) < 3:\n                        return ChatMessage(\n                            role=MessageRole.ASSISTANT,\n                            content=\"\",\n                            tool_calls=[\n                                ChatMessageToolCall(\n                                    id=\"call_0\",\n                                    type=\"function\",\n                                    function=ChatMessageToolCallFunction(\n                                        name=\"search_agent\",\n                                        arguments=\"Who is the current US president?\",\n                                    ),\n                                )\n                            ],\n                        )\n                    else:\n                        assert \"Report on the current US president\" in str(messages)\n                        return ChatMessage(\n                            role=MessageRole.ASSISTANT,\n                            content=\"\",\n                            tool_calls=[\n                                ChatMessageToolCall(\n                                    id=\"call_0\",\n                                    type=\"function\",\n                                    function=ChatMessageToolCallFunction(\n                                        name=\"final_answer\", arguments=\"Final report.\"\n                                    ),\n                                )\n                            ],\n                        )\n                else:\n                    if len(messages) < 3:\n                        return ChatMessage(\n                            role=MessageRole.ASSISTANT,\n     
                       content=\"\"\"\nThought: Let's call our search agent.\n<code>\nresult = search_agent(\"Who is the current US president?\")\n</code>\n\"\"\",\n                        )\n                    else:\n                        assert \"Report on the current US president\" in str(messages)\n                        return ChatMessage(\n                            role=MessageRole.ASSISTANT,\n                            content=\"\"\"\nThought: Let's return the report.\n<code>\nfinal_answer(\"Final report.\")\n</code>\n\"\"\",\n                        )\n\n        manager_model = FakeModelMultiagentsManagerAgent()\n\n        class FakeModelMultiagentsManagedAgent(Model):\n            model_id = \"fake_model\"\n\n            def generate(\n                self,\n                messages,\n                tools_to_call_from=None,\n                stop_sequences=None,\n            ):\n                return ChatMessage(\n                    role=MessageRole.ASSISTANT,\n                    content=\"Here is the secret content: FLAG1\",\n                    tool_calls=[\n                        ChatMessageToolCall(\n                            id=\"call_0\",\n                            type=\"function\",\n                            function=ChatMessageToolCallFunction(\n                                name=\"final_answer\",\n                                arguments=\"Report on the current US president\",\n                            ),\n                        )\n                    ],\n                )\n\n        managed_model = FakeModelMultiagentsManagedAgent()\n\n        web_agent = ToolCallingAgent(\n            tools=[],\n            model=managed_model,\n            max_steps=10,\n            name=\"search_agent\",\n            description=\"Runs web searches for you. Give it your request as an argument. 
Make the request as detailed as needed, you can ask for thorough reports\",\n            verbosity_level=2,\n        )\n\n        manager_code_agent = CodeAgent(\n            tools=[],\n            model=manager_model,\n            managed_agents=[web_agent],\n            additional_authorized_imports=[\"time\", \"numpy\", \"pandas\"],\n        )\n\n        report = manager_code_agent.run(\"Fake question.\")\n        assert report == \"Final report.\"\n\n        manager_toolcalling_agent = ToolCallingAgent(\n            tools=[],\n            model=manager_model,\n            managed_agents=[web_agent],\n        )\n\n        with web_agent.logger.console.capture() as capture:\n            report = manager_toolcalling_agent.run(\"Fake question.\")\n        assert report == \"Final report.\"\n        assert \"FLAG1\" in capture.get()  # Check that managed agent's output is properly logged\n\n        # Test that visualization works\n        with manager_toolcalling_agent.logger.console.capture() as capture:\n            manager_toolcalling_agent.visualize()\n        assert \"├──\" in capture.get()\n\n\n@pytest.fixture\ndef prompt_templates():\n    return {\n        \"system_prompt\": \"This is a test system prompt.\",\n        \"managed_agent\": {\"task\": \"Task for {{name}}: {{task}}\", \"report\": \"Report for {{name}}: {{final_answer}}\"},\n        \"planning\": {\n            \"initial_plan\": \"The plan.\",\n            \"update_plan_pre_messages\": \"custom\",\n            \"update_plan_post_messages\": \"custom\",\n        },\n        \"final_answer\": {\"pre_messages\": \"custom\", \"post_messages\": \"custom\"},\n    }\n\n\n@pytest.mark.parametrize(\n    \"arguments\",\n    [\n        {},\n        {\"arg\": \"bar\"},\n        {None: None},\n        [1, 2, 3],\n    ],\n)\ndef test_tool_calling_agents_raises_tool_call_error_being_invoked_with_wrong_arguments(arguments):\n    @tool\n    def _sample_tool(prompt: str) -> str:\n        \"\"\"Tool that returns same 
string\n        Args:\n            prompt: The string to return\n        Returns:\n            The same string\n        \"\"\"\n\n        return prompt\n\n    agent = ToolCallingAgent(model=FakeToolCallModel(), tools=[_sample_tool])\n    with pytest.raises(AgentToolCallError):\n        agent.execute_tool_call(_sample_tool.name, arguments)\n\n\ndef test_tool_calling_agents_raises_agent_execution_error_when_tool_raises():\n    @tool\n    def _sample_tool(_: str) -> float:\n        \"\"\"Tool that fails\n\n        Args:\n            _: The pointless string\n        Returns:\n            Some number\n        \"\"\"\n\n        return 1 / 0\n\n    agent = ToolCallingAgent(model=FakeToolCallModel(), tools=[_sample_tool])\n    with pytest.raises(AgentExecutionError):\n        agent.execute_tool_call(_sample_tool.name, \"sample\")\n"
  },
  {
    "path": "tests/test_all_docs.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\nimport ast\nimport os\nimport re\nimport shutil\nimport subprocess\nimport tempfile\nimport traceback\nfrom pathlib import Path\n\nimport pytest\nfrom dotenv import load_dotenv\n\nfrom .utils.markers import require_run_all\n\n\nclass SubprocessCallException(Exception):\n    pass\n\n\ndef run_command(command: list[str], return_stdout=False, env=None):\n    \"\"\"\n    Runs command with subprocess.check_output and returns stdout if requested.\n    Properly captures and handles errors during command execution.\n    \"\"\"\n    for i, c in enumerate(command):\n        if isinstance(c, Path):\n            command[i] = str(c)\n\n    if env is None:\n        env = os.environ.copy()\n\n    try:\n        output = subprocess.check_output(command, stderr=subprocess.STDOUT, env=env)\n        if return_stdout:\n            if hasattr(output, \"decode\"):\n                output = output.decode(\"utf-8\")\n            return output\n    except subprocess.CalledProcessError as e:\n        raise SubprocessCallException(\n            f\"Command `{' '.join(command)}` failed with the following error:\\n\\n{e.output.decode()}\"\n        ) from e\n\n\nclass DocCodeExtractor:\n    \"\"\"Handles extraction and validation of Python code from markdown files.\"\"\"\n\n    @staticmethod\n    def extract_python_code(content: str) -> list[str]:\n        \"\"\"Extract Python 
code blocks from markdown content.\"\"\"\n        pattern = r\"```(?:python|py)\\n(.*?)\\n```\"\n        matches = re.finditer(pattern, content, re.DOTALL)\n        return [match.group(1).strip() for match in matches]\n\n    @staticmethod\n    def create_test_script(code_blocks: list[str], tmp_dir: str) -> Path:\n        \"\"\"Create a temporary Python script from code blocks.\"\"\"\n        combined_code = \"\\n\\n\".join(code_blocks)\n        assert len(combined_code) > 0, \"Code is empty!\"\n        tmp_file = Path(tmp_dir) / \"test_script.py\"\n\n        with open(tmp_file, \"w\", encoding=\"utf-8\") as f:\n            f.write(combined_code)\n\n        return tmp_file\n\n\n# Skip: slow tests + require API keys\n@require_run_all\nclass TestDocs:\n    \"\"\"Test case for documentation code testing.\"\"\"\n\n    @classmethod\n    def setup_class(cls):\n        cls._tmpdir = tempfile.mkdtemp()\n        cls.launch_args = [\"python3\"]\n        cls.docs_dir = Path(__file__).parent.parent / \"docs\" / \"source\" / \"en\"\n        cls.extractor = DocCodeExtractor()\n\n        if not cls.docs_dir.exists():\n            raise ValueError(f\"Docs directory not found at {cls.docs_dir}\")\n\n        load_dotenv()\n\n        cls.md_files = list(cls.docs_dir.rglob(\"*.md\")) + list(cls.docs_dir.rglob(\"*.mdx\"))\n        if not cls.md_files:\n            raise ValueError(f\"No markdown files found in {cls.docs_dir}\")\n\n    @classmethod\n    def teardown_class(cls):\n        shutil.rmtree(cls._tmpdir)\n\n    @pytest.mark.timeout(100)\n    def test_single_doc(self, doc_path: Path):\n        \"\"\"Test a single documentation file.\"\"\"\n        with open(doc_path, \"r\", encoding=\"utf-8\") as f:\n            content = f.read()\n\n        code_blocks = self.extractor.extract_python_code(content)\n        excluded_snippets = [\n            \"ToolCollection\",\n            \"image_generation_tool\",  # We don't want to run this expensive operation\n            
\"from_langchain\",  # Langchain is not a dependency\n            \"while llm_should_continue(memory):\",  # This is pseudo code\n            \"ollama_chat/llama3.2\",  # Exclude ollama building in guided tour\n            \"model = TransformersModel(model_id=model_id)\",  # Exclude testing with transformers model\n            \"SmolagentsInstrumentor\",  # Exclude telemetry since it needs additional installs\n        ]\n        code_blocks = [\n            block\n            for block in code_blocks\n            if not any(\n                [snippet in block for snippet in excluded_snippets]\n            )  # Exclude these tools that take longer to run and add dependencies\n        ]\n        if len(code_blocks) == 0:\n            pytest.skip(f\"No Python code blocks found in {doc_path.name}\")\n\n        # Validate syntax of each block individually by parsing it\n        for i, block in enumerate(code_blocks, 1):\n            ast.parse(block)\n\n        # Create and execute test script\n        print(\"\\n\\nCollected code block:==========\\n\".join(code_blocks))\n        try:\n            code_blocks = [\n                (\n                    block.replace(\"<YOUR_HUGGINGFACEHUB_API_TOKEN>\", os.getenv(\"HF_TOKEN\"))\n                    .replace(\"YOUR_ANTHROPIC_API_KEY\", os.getenv(\"ANTHROPIC_API_KEY\"))\n                    .replace(\"{your_username}\", \"m-ric\")\n                )\n                for block in code_blocks\n            ]\n            test_script = self.extractor.create_test_script(code_blocks, self._tmpdir)\n            run_command(self.launch_args + [str(test_script)])\n\n        except SubprocessCallException as e:\n            pytest.fail(f\"\\nError while testing {doc_path.name}:\\n{str(e)}\")\n        except Exception:\n            pytest.fail(f\"\\nUnexpected error while testing {doc_path.name}:\\n{traceback.format_exc()}\")\n\n    @pytest.fixture(autouse=True)\n    def _setup(self):\n        \"\"\"Fixture to ensure temporary 
directory exists for each test.\"\"\"\n        os.makedirs(self._tmpdir, exist_ok=True)\n        yield\n        # Clean up test files after each test\n        for file in Path(self._tmpdir).glob(\"*\"):\n            file.unlink()\n\n\ndef pytest_generate_tests(metafunc):\n    \"\"\"Generate test cases for each markdown file.\"\"\"\n    if \"doc_path\" in metafunc.fixturenames:\n        test_class = metafunc.cls\n\n        # Initialize the class if needed\n        if not hasattr(test_class, \"md_files\"):\n            test_class.setup_class()\n\n        # Parameterize with the markdown files\n        metafunc.parametrize(\"doc_path\", test_class.md_files, ids=[f.stem for f in test_class.md_files])\n"
  },
  {
    "path": "tests/test_cli.py",
    "content": "from unittest.mock import patch\n\nimport pytest\n\nfrom smolagents.cli import load_model\nfrom smolagents.local_python_executor import CodeOutput, LocalPythonExecutor\nfrom smolagents.models import InferenceClientModel, LiteLLMModel, OpenAIModel, TransformersModel\n\n\n@pytest.fixture\ndef set_env_vars(monkeypatch):\n    monkeypatch.setenv(\"FIREWORKS_API_KEY\", \"test_fireworks_api_key\")\n    monkeypatch.setenv(\"HF_TOKEN\", \"test_hf_api_key\")\n\n\ndef test_load_model_openai_model(set_env_vars):\n    with patch(\"openai.OpenAI\") as MockOpenAI:\n        model = load_model(\"OpenAIModel\", \"test_model_id\")\n    assert isinstance(model, OpenAIModel)\n    assert model.model_id == \"test_model_id\"\n    assert MockOpenAI.call_count == 1\n    assert MockOpenAI.call_args.kwargs[\"base_url\"] == \"https://api.fireworks.ai/inference/v1\"\n    assert MockOpenAI.call_args.kwargs[\"api_key\"] == \"test_fireworks_api_key\"\n\n\ndef test_load_model_litellm_model():\n    model = load_model(\"LiteLLMModel\", \"test_model_id\", api_key=\"test_api_key\", api_base=\"https://api.test.com\")\n    assert isinstance(model, LiteLLMModel)\n    assert model.api_key == \"test_api_key\"\n    assert model.api_base == \"https://api.test.com\"\n    assert model.model_id == \"test_model_id\"\n\n\ndef test_load_model_transformers_model():\n    with (\n        patch(\n            \"transformers.AutoModelForImageTextToText.from_pretrained\",\n            side_effect=ValueError(\"Unrecognized configuration class\"),\n        ),\n        patch(\"transformers.AutoModelForCausalLM.from_pretrained\"),\n        patch(\"transformers.AutoTokenizer.from_pretrained\"),\n    ):\n        model = load_model(\"TransformersModel\", \"test_model_id\")\n    assert isinstance(model, TransformersModel)\n    assert model.model_id == \"test_model_id\"\n\n\ndef test_load_model_hf_api_model(set_env_vars):\n    with patch(\"huggingface_hub.InferenceClient\") as huggingface_hub_InferenceClient:\n     
   model = load_model(\"InferenceClientModel\", \"test_model_id\")\n    assert isinstance(model, InferenceClientModel)\n    assert model.model_id == \"test_model_id\"\n    assert huggingface_hub_InferenceClient.call_count == 1\n    assert huggingface_hub_InferenceClient.call_args.kwargs[\"token\"] == \"test_hf_api_key\"\n\n\ndef test_load_model_invalid_model_type():\n    with pytest.raises(ValueError, match=\"Unsupported model type: InvalidModel\"):\n        load_model(\"InvalidModel\", \"test_model_id\")\n\n\ndef test_cli_main(capsys):\n    with patch(\"smolagents.cli.load_model\") as mock_load_model:\n        mock_load_model.return_value = \"mock_model\"\n        with patch(\"smolagents.cli.CodeAgent\") as mock_code_agent:\n            from smolagents.cli import run_smolagent\n\n            run_smolagent(\"test_prompt\", [], \"InferenceClientModel\", \"test_model_id\", provider=\"hf-inference\")\n    # load_model\n    assert len(mock_load_model.call_args_list) == 1\n    assert mock_load_model.call_args.args == (\"InferenceClientModel\", \"test_model_id\")\n    assert mock_load_model.call_args.kwargs == {\"api_base\": None, \"api_key\": None, \"provider\": \"hf-inference\"}\n    # CodeAgent\n    assert len(mock_code_agent.call_args_list) == 1\n    assert mock_code_agent.call_args.args == ()\n    assert mock_code_agent.call_args.kwargs == {\n        \"tools\": [],\n        \"model\": \"mock_model\",\n        \"additional_authorized_imports\": None,\n        \"stream_outputs\": True,\n    }\n    # agent.run\n    assert len(mock_code_agent.return_value.run.call_args_list) == 1\n    assert mock_code_agent.return_value.run.call_args.args == (\"test_prompt\",)\n\n\ndef test_vision_web_browser_main():\n    with patch(\"smolagents.vision_web_browser.helium\"):\n        with patch(\"smolagents.vision_web_browser.load_model\") as mock_load_model:\n            mock_load_model.return_value = \"mock_model\"\n            with patch(\"smolagents.vision_web_browser.CodeAgent\") 
as mock_code_agent:\n                from smolagents.vision_web_browser import helium_instructions, run_webagent\n\n                run_webagent(\"test_prompt\", \"InferenceClientModel\", \"test_model_id\", provider=\"hf-inference\")\n    # load_model\n    assert len(mock_load_model.call_args_list) == 1\n    assert mock_load_model.call_args.args == (\"InferenceClientModel\", \"test_model_id\")\n    # CodeAgent\n    assert len(mock_code_agent.call_args_list) == 1\n    assert mock_code_agent.call_args.args == ()\n    assert len(mock_code_agent.call_args.kwargs[\"tools\"]) == 4\n    assert mock_code_agent.call_args.kwargs[\"model\"] == \"mock_model\"\n    assert mock_code_agent.call_args.kwargs[\"additional_authorized_imports\"] == [\"helium\"]\n    # agent.python_executor\n    assert len(mock_code_agent.return_value.python_executor.call_args_list) == 1\n    assert mock_code_agent.return_value.python_executor.call_args.args == (\"from helium import *\",)\n    assert LocalPythonExecutor([\"helium\"])(\"from helium import *\") == CodeOutput(\n        output=None, logs=\"\", is_final_answer=False\n    )\n    # agent.run\n    assert len(mock_code_agent.return_value.run.call_args_list) == 1\n    assert mock_code_agent.return_value.run.call_args.args == (\"test_prompt\" + helium_instructions,)\n"
  },
  {
    "path": "tests/test_default_tools.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport unittest\n\nimport pytest\n\nfrom smolagents.agent_types import _AGENT_TYPE_MAPPING\nfrom smolagents.default_tools import (\n    DuckDuckGoSearchTool,\n    PythonInterpreterTool,\n    SpeechToTextTool,\n    VisitWebpageTool,\n    WikipediaSearchTool,\n)\nfrom smolagents.local_python_executor import ExecutionTimeoutError\n\nfrom .test_tools import ToolTesterMixin\nfrom .utils.markers import require_run_all\n\n\nclass DefaultToolTests(unittest.TestCase):\n    def test_visit_webpage(self):\n        arguments = {\"url\": \"https://huggingface.co/\"}\n        result = VisitWebpageTool()(arguments)\n        assert isinstance(result, str)\n        assert \"Hugging Face – The AI community building the future\" in result\n\n    @require_run_all\n    def test_ddgs_with_kwargs(self):\n        result = DuckDuckGoSearchTool(timeout=20)(\"DeepSeek parent company\")\n        assert isinstance(result, str)\n\n\nclass TestPythonInterpreterTool(ToolTesterMixin):\n    def setup_method(self):\n        self.tool = PythonInterpreterTool(authorized_imports=[\"numpy\"])\n        self.tool.setup()\n\n    def test_exact_match_arg(self):\n        result = self.tool(\"(2 / 2) * 4\")\n        assert result == \"Stdout:\\n\\nOutput: 4.0\"\n\n    def test_exact_match_kwarg(self):\n        result = self.tool(code=\"(2 / 2) * 4\")\n        assert result == 
\"Stdout:\\n\\nOutput: 4.0\"\n\n    def test_agent_type_output(self):\n        inputs = [\"2 * 2\"]\n        output = self.tool(*inputs, sanitize_inputs_outputs=True)\n        output_type = _AGENT_TYPE_MAPPING[self.tool.output_type]\n        assert isinstance(output, output_type)\n\n    def test_agent_types_inputs(self):\n        inputs = [\"2 * 2\"]\n        _inputs = []\n\n        for _input, expected_input in zip(inputs, self.tool.inputs.values()):\n            input_type = expected_input[\"type\"]\n            if isinstance(input_type, list):\n                _inputs.append([_AGENT_TYPE_MAPPING[_input_type](_input) for _input_type in input_type])\n            else:\n                _inputs.append(_AGENT_TYPE_MAPPING[input_type](_input))\n\n        # Should not raise an error\n        output = self.tool(*inputs, sanitize_inputs_outputs=True)\n        output_type = _AGENT_TYPE_MAPPING[self.tool.output_type]\n        assert isinstance(output, output_type)\n\n    def test_imports_work(self):\n        result = self.tool(\"import numpy as np\")\n        assert \"import from numpy is not allowed\" not in result.lower()\n\n    def test_unauthorized_imports_fail(self):\n        with pytest.raises(Exception) as e:\n            self.tool(\"import sympy as sp\")\n        assert \"sympy\" in str(e).lower()\n\n    def test_custom_timeout(self):\n        \"\"\"Test that PythonInterpreterTool respects custom timeout.\"\"\"\n        tool = PythonInterpreterTool(authorized_imports=[\"time\"], timeout_seconds=1)\n        tool.setup()\n\n        # Code that sleeps for 2 seconds should timeout with 1-second limit\n        code = \"\"\"\nimport time\ntime.sleep(2)\n\"\"\"\n        with pytest.raises(ExecutionTimeoutError, match=\"Code execution exceeded the maximum execution time\"):\n            tool(code)\n\n    def test_disabled_timeout(self):\n        \"\"\"Test that PythonInterpreterTool can disable timeout.\"\"\"\n        tool = 
PythonInterpreterTool(authorized_imports=[\"time\"], timeout_seconds=None)\n        tool.setup()\n\n        # Code should complete even without timeout\n        code = \"\"\"\nimport time\ntime.sleep(0.5)\nresult = \"completed\"\n\"\"\"\n        result = tool(code)\n        assert \"completed\" in result\n\n\nclass TestSpeechToTextTool:\n    def test_new_instance(self):\n        from transformers.models.whisper import WhisperForConditionalGeneration, WhisperProcessor\n\n        tool = SpeechToTextTool()\n        assert tool is not None\n        assert tool.pre_processor_class == WhisperProcessor\n        assert tool.model_class == WhisperForConditionalGeneration\n\n    def test_initialization(self):\n        from transformers.models.whisper import WhisperForConditionalGeneration, WhisperProcessor\n\n        tool = SpeechToTextTool(model=\"dummy_model_id\")\n        assert tool is not None\n        assert tool.pre_processor_class == WhisperProcessor\n        assert tool.model_class == WhisperForConditionalGeneration\n\n\n@pytest.mark.parametrize(\n    \"language, content_type, extract_format, query\",\n    [\n        (\"en\", \"summary\", \"HTML\", \"Python_(programming_language)\"),  # English, Summary Mode, HTML format\n        (\"en\", \"text\", \"WIKI\", \"Python_(programming_language)\"),  # English, Full Text Mode, WIKI format\n        (\"es\", \"summary\", \"HTML\", \"Python_(lenguaje_de_programación)\"),  # Spanish, Summary Mode, HTML format\n        (\"es\", \"text\", \"WIKI\", \"Python_(lenguaje_de_programación)\"),  # Spanish, Full Text Mode, WIKI format\n    ],\n)\ndef test_wikipedia_search(language, content_type, extract_format, query):\n    tool = WikipediaSearchTool(\n        user_agent=\"TestAgent (test@example.com)\",\n        language=language,\n        content_type=content_type,\n        extract_format=extract_format,\n    )\n\n    result = tool.forward(query)\n\n    assert isinstance(result, str), \"Output should be a string\"\n    assert \"✅ 
**Wikipedia Page:**\" in result, \"Response should contain Wikipedia page title\"\n    assert \"🔗 **Read more:**\" in result, \"Response should contain Wikipedia page URL\"\n\n    if content_type == \"summary\":\n        assert len(result.split()) < 1000, \"Summary mode should return a shorter text\"\n    if content_type == \"text\":\n        assert len(result.split()) > 1000, \"Full text mode should return a longer text\"\n"
  },
  {
    "path": "tests/test_final_answer.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n\nimport numpy as np\nimport PIL.Image\nimport pytest\n\nfrom smolagents.agent_types import _AGENT_TYPE_MAPPING\nfrom smolagents.default_tools import FinalAnswerTool\n\nfrom .test_tools import ToolTesterMixin\nfrom .utils.markers import require_torch\n\n\nclass TestFinalAnswerTool(ToolTesterMixin):\n    def setup_method(self):\n        self.inputs = {\"answer\": \"Final answer\"}\n        self.tool = FinalAnswerTool()\n\n    def test_exact_match_arg(self):\n        result = self.tool(\"Final answer\")\n        assert result == \"Final answer\"\n\n    def test_exact_match_kwarg(self):\n        result = self.tool(answer=self.inputs[\"answer\"])\n        assert result == \"Final answer\"\n\n    @require_torch\n    def test_agent_type_output(self, inputs):\n        for input_type, input in inputs.items():\n            output = self.tool(**input, sanitize_inputs_outputs=True)\n            agent_type = _AGENT_TYPE_MAPPING[input_type]\n            assert isinstance(output, agent_type)\n\n    @pytest.fixture\n    def inputs(self, shared_datadir):\n        import torch\n\n        return {\n            \"string\": {\"answer\": \"Text input\"},\n            \"image\": {\"answer\": PIL.Image.open(shared_datadir / \"000000039769.png\").resize((512, 512))},\n            \"audio\": {\"answer\": torch.Tensor(np.ones(3000))},\n        }\n"
  },
  {
    "path": "tests/test_function_type_hints_utils.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nfrom typing import Any\n\nimport pytest\n\nfrom smolagents._function_type_hints_utils import DocstringParsingException, get_imports, get_json_schema\n\n\n@pytest.fixture\ndef valid_func():\n    \"\"\"A well-formed function with docstring, type hints, and return block.\"\"\"\n\n    def multiply(x: int, y: float) -> float:\n        \"\"\"\n        Multiplies two numbers.\n\n        Args:\n            x: The first number.\n            y: The second number.\n        Returns:\n            Product of x and y.\n        \"\"\"\n        return x * y\n\n    return multiply\n\n\n@pytest.fixture\ndef no_docstring_func():\n    \"\"\"Function with no docstring.\"\"\"\n\n    def sample(x: int):\n        return x\n\n    return sample\n\n\n@pytest.fixture\ndef missing_arg_doc_func():\n    \"\"\"Function with docstring but missing an argument description.\"\"\"\n\n    def add(x: int, y: int):\n        \"\"\"\n        Adds two numbers.\n\n        Args:\n            x: The first number.\n        \"\"\"\n        return x + y\n\n    return add\n\n\n@pytest.fixture\ndef bad_return_func():\n    \"\"\"Function docstring with missing return description (allowed).\"\"\"\n\n    def do_nothing(x: str | None = None):\n        \"\"\"\n        Does nothing.\n\n        Args:\n            x: Some optional string.\n        \"\"\"\n        pass\n\n    return 
do_nothing\n\n\n@pytest.fixture\ndef complex_types_func():\n    def process_data(items: list[str], config: dict[str, float], point: tuple[int, int]) -> dict:\n        \"\"\"\n        Process some data.\n\n        Args:\n            items: List of items to process.\n            config: Configuration parameters.\n            point: A position as (x,y).\n\n        Returns:\n            Processed data result.\n        \"\"\"\n        return {\"result\": True}\n\n    return process_data\n\n\n@pytest.fixture\ndef optional_types_func():\n    def process_with_optional(required_arg: str, optional_arg: int | None = None) -> str:\n        \"\"\"\n        Process with optional argument.\n\n        Args:\n            required_arg: A required string argument.\n            optional_arg: An optional integer argument.\n\n        Returns:\n            Processing result.\n        \"\"\"\n        return \"processed\"\n\n    return process_with_optional\n\n\n@pytest.fixture\ndef enum_choices_func():\n    def select_color(color: str) -> str:\n        \"\"\"\n        Select a color.\n\n        Args:\n            color: The color to select (choices: [\"red\", \"green\", \"blue\"])\n\n        Returns:\n            Selected color.\n        \"\"\"\n        return color\n\n    return select_color\n\n\n@pytest.fixture\ndef union_types_func():\n    def process_union(value: int | str) -> bool | str:\n        \"\"\"\n        Process a value that can be either int or string.\n\n        Args:\n            value: An integer or string value.\n\n        Returns:\n            Processing result.\n        \"\"\"\n        return True if isinstance(value, int) else \"string result\"\n\n    return process_union\n\n\n@pytest.fixture\ndef nested_types_func():\n    def process_nested_data(data: list[dict[str, Any]]) -> list[str]:\n        \"\"\"\n        Process nested data structure.\n\n        Args:\n            data: List of dictionaries to process.\n\n        Returns:\n            List of processed 
results.\n        \"\"\"\n        return [\"result\"]\n\n    return process_nested_data\n\n\n@pytest.fixture\ndef typed_docstring_func():\n    def calculate(x: int, y: float) -> float:\n        \"\"\"\n        Calculate something.\n\n        Args:\n            x (int): An integer parameter with type in docstring.\n            y (float): A float parameter with type in docstring.\n\n        Returns:\n            float: The calculated result.\n        \"\"\"\n        return x * y\n\n    return calculate\n\n\n@pytest.fixture\ndef mismatched_types_func():\n    def convert(value: int) -> str:\n        \"\"\"\n        Convert a value.\n\n        Args:\n            value (str): A string value (type mismatch with hint).\n\n        Returns:\n            int: Converted value (type mismatch with hint).\n        \"\"\"\n        return str(value)\n\n    return convert\n\n\n@pytest.fixture\ndef complex_docstring_types_func():\n    def process(data: dict[str, list[int]]) -> list[dict[str, Any]]:\n        \"\"\"\n        Process complex data.\n\n        Args:\n            data (Dict[str, List[int]]): Nested structure with types.\n\n        Returns:\n            List[Dict[str, Any]]: Processed results with types.\n        \"\"\"\n        return [{\"result\": sum(v) for k, v in data.items()}]\n\n    return process\n\n\n@pytest.fixture\ndef keywords_in_description_func():\n    def process(value: str) -> str:\n        \"\"\"\n        Function with Args: or Returns: keywords in its description.\n\n        Args:\n            value: A string value.\n\n        Returns:\n            str: Processed value.\n        \"\"\"\n        return value.upper()\n\n    return process\n\n\nclass TestGetJsonSchema:\n    def test_get_json_schema_example(self):\n        def fn(x: int, y: tuple[str, str, float] | None = None) -> None:\n            \"\"\"\n            Test function\n            Args:\n                x: The first input\n                y: The second input\n            \"\"\"\n            
pass\n\n        schema = get_json_schema(fn)\n        expected_schema = {\n            \"name\": \"fn\",\n            \"description\": \"Test function\",\n            \"parameters\": {\n                \"type\": \"object\",\n                \"properties\": {\n                    \"x\": {\"type\": \"integer\", \"description\": \"The first input\"},\n                    \"y\": {\n                        \"type\": \"array\",\n                        \"description\": \"The second input\",\n                        \"nullable\": True,\n                        \"prefixItems\": [{\"type\": \"string\"}, {\"type\": \"string\"}, {\"type\": \"number\"}],\n                    },\n                },\n                \"required\": [\"x\"],\n            },\n            \"return\": {\"type\": \"null\"},\n        }\n        assert schema[\"function\"][\"parameters\"][\"properties\"][\"y\"] == expected_schema[\"parameters\"][\"properties\"][\"y\"]\n        assert schema[\"function\"] == expected_schema\n\n    @pytest.mark.parametrize(\n        \"fixture_name,should_fail\",\n        [\n            (\"valid_func\", False),\n            # ('no_docstring_func', True),\n            # ('missing_arg_doc_func', True),\n            (\"bad_return_func\", False),\n        ],\n    )\n    def test_get_json_schema(self, request, fixture_name, should_fail):\n        func = request.getfixturevalue(fixture_name)\n        schema = get_json_schema(func)\n        assert schema[\"type\"] == \"function\"\n        assert \"function\" in schema\n        assert \"parameters\" in schema[\"function\"]\n\n    @pytest.mark.parametrize(\n        \"fixture_name,should_fail\",\n        [\n            # ('valid_func', False),\n            (\"no_docstring_func\", True),\n            (\"missing_arg_doc_func\", True),\n            # ('bad_return_func', False),\n        ],\n    )\n    def test_get_json_schema_raises(self, request, fixture_name, should_fail):\n        func = request.getfixturevalue(fixture_name)\n        
with pytest.raises(DocstringParsingException):\n            get_json_schema(func)\n\n    @pytest.mark.parametrize(\n        \"fixture_name,expected_properties\",\n        [\n            (\"valid_func\", {\"x\": \"integer\", \"y\": \"number\"}),\n            (\"bad_return_func\", {\"x\": \"string\"}),\n        ],\n    )\n    def test_property_types(self, request, fixture_name, expected_properties):\n        \"\"\"Test that property types are correctly mapped.\"\"\"\n        func = request.getfixturevalue(fixture_name)\n        schema = get_json_schema(func)\n\n        properties = schema[\"function\"][\"parameters\"][\"properties\"]\n        for prop_name, expected_type in expected_properties.items():\n            assert properties[prop_name][\"type\"] == expected_type\n\n    def test_schema_basic_structure(self, valid_func):\n        \"\"\"Test that basic schema structure is correct.\"\"\"\n        schema = get_json_schema(valid_func)\n        # Check schema type\n        assert schema[\"type\"] == \"function\"\n        assert \"function\" in schema\n        # Check function schema\n        function_schema = schema[\"function\"]\n        assert function_schema[\"name\"] == \"multiply\"\n        assert \"description\" in function_schema\n        assert function_schema[\"description\"] == \"Multiplies two numbers.\"\n        # Check parameters schema\n        assert \"parameters\" in function_schema\n        params = function_schema[\"parameters\"]\n        assert params[\"type\"] == \"object\"\n        assert \"properties\" in params\n        assert \"required\" in params\n        assert set(params[\"required\"]) == {\"x\", \"y\"}\n        properties = params[\"properties\"]\n        assert properties[\"x\"][\"type\"] == \"integer\"\n        assert properties[\"y\"][\"type\"] == \"number\"\n        # Check return schema\n        assert \"return\" in function_schema\n        return_schema = function_schema[\"return\"]\n        assert return_schema[\"type\"] == 
\"number\"\n        assert return_schema[\"description\"] == \"Product of x and y.\"\n\n    def test_complex_types(self, complex_types_func):\n        \"\"\"Test schema generation for complex types.\"\"\"\n        schema = get_json_schema(complex_types_func)\n        properties = schema[\"function\"][\"parameters\"][\"properties\"]\n        # Check list type\n        assert properties[\"items\"][\"type\"] == \"array\"\n        # Check dict type\n        assert properties[\"config\"][\"type\"] == \"object\"\n        # Check tuple type\n        assert properties[\"point\"][\"type\"] == \"array\"\n        assert len(properties[\"point\"][\"prefixItems\"]) == 2\n        assert properties[\"point\"][\"prefixItems\"][0][\"type\"] == \"integer\"\n        assert properties[\"point\"][\"prefixItems\"][1][\"type\"] == \"integer\"\n\n    def test_optional_types(self, optional_types_func):\n        \"\"\"Test schema generation for optional arguments.\"\"\"\n        schema = get_json_schema(optional_types_func)\n        params = schema[\"function\"][\"parameters\"]\n        # Required argument should be in required list\n        assert \"required_arg\" in params[\"required\"]\n        # Optional argument should not be in required list\n        assert \"optional_arg\" not in params[\"required\"]\n        # Optional argument should be nullable\n        assert params[\"properties\"][\"optional_arg\"][\"nullable\"] is True\n        assert params[\"properties\"][\"optional_arg\"][\"type\"] == \"integer\"\n\n    def test_enum_choices(self, enum_choices_func):\n        \"\"\"Test schema generation for enum choices in docstring.\"\"\"\n        schema = get_json_schema(enum_choices_func)\n        color_prop = schema[\"function\"][\"parameters\"][\"properties\"][\"color\"]\n        assert \"enum\" in color_prop\n        assert color_prop[\"enum\"] == [\"red\", \"green\", \"blue\"]\n\n    def test_union_types(self, union_types_func):\n        \"\"\"Test schema generation for union 
types.\"\"\"\n        schema = get_json_schema(union_types_func)\n        value_prop = schema[\"function\"][\"parameters\"][\"properties\"][\"value\"]\n        return_prop = schema[\"function\"][\"return\"]\n        # Check union in parameter\n        assert len(value_prop[\"type\"]) == 2\n        # Check union in return type: should be converted to \"any\"\n        assert return_prop[\"type\"] == \"any\"\n\n    def test_nested_types(self, nested_types_func):\n        \"\"\"Test schema generation for nested complex types.\"\"\"\n        schema = get_json_schema(nested_types_func)\n        data_prop = schema[\"function\"][\"parameters\"][\"properties\"][\"data\"]\n        assert data_prop[\"type\"] == \"array\"\n\n    def test_typed_docstring_parsing(self, typed_docstring_func):\n        \"\"\"Test parsing of docstrings with type annotations.\"\"\"\n        schema = get_json_schema(typed_docstring_func)\n        # Type hints should take precedence over docstring types\n        assert schema[\"function\"][\"parameters\"][\"properties\"][\"x\"][\"type\"] == \"integer\"\n        assert schema[\"function\"][\"parameters\"][\"properties\"][\"y\"][\"type\"] == \"number\"\n        # Description should be extracted correctly\n        assert (\n            schema[\"function\"][\"parameters\"][\"properties\"][\"x\"][\"description\"]\n            == \"An integer parameter with type in docstring.\"\n        )\n        assert (\n            schema[\"function\"][\"parameters\"][\"properties\"][\"y\"][\"description\"]\n            == \"A float parameter with type in docstring.\"\n        )\n        # Return type and description should be correct\n        assert schema[\"function\"][\"return\"][\"type\"] == \"number\"\n        assert schema[\"function\"][\"return\"][\"description\"] == \"The calculated result.\"\n\n    def test_mismatched_docstring_types(self, mismatched_types_func):\n        \"\"\"Test that type hints take precedence over docstring types when they 
conflict.\"\"\"\n        schema = get_json_schema(mismatched_types_func)\n        # Type hints should take precedence over docstring types\n        assert schema[\"function\"][\"parameters\"][\"properties\"][\"value\"][\"type\"] == \"integer\"\n        # Return type from type hint should be used, not docstring\n        assert schema[\"function\"][\"return\"][\"type\"] == \"string\"\n\n    def test_complex_docstring_types(self, complex_docstring_types_func):\n        \"\"\"Test parsing of complex type annotations in docstrings.\"\"\"\n        schema = get_json_schema(complex_docstring_types_func)\n        # Check that complex nested type is parsed correctly from type hints\n        data_prop = schema[\"function\"][\"parameters\"][\"properties\"][\"data\"]\n        assert data_prop[\"type\"] == \"object\"\n        # Check return type\n        return_prop = schema[\"function\"][\"return\"]\n        assert return_prop[\"type\"] == \"array\"\n        # Description should include the type information from docstring\n        assert data_prop[\"description\"] == \"Nested structure with types.\"\n        assert return_prop[\"description\"] == \"Processed results with types.\"\n\n    @pytest.mark.parametrize(\n        \"fixture_name,expected_description\",\n        [\n            (\"typed_docstring_func\", \"An integer parameter with type in docstring.\"),\n            (\"complex_docstring_types_func\", \"Nested structure with types.\"),\n        ],\n    )\n    def test_type_in_description_handling(self, request, fixture_name, expected_description):\n        \"\"\"Test that type information in docstrings is preserved in description.\"\"\"\n        func = request.getfixturevalue(fixture_name)\n        schema = get_json_schema(func)\n        # First parameter description should contain the expected text\n        first_param_name = list(schema[\"function\"][\"parameters\"][\"properties\"].keys())[0]\n        assert 
schema[\"function\"][\"parameters\"][\"properties\"][first_param_name][\"description\"] == expected_description\n\n    def test_with_special_words_in_description_func(self, keywords_in_description_func):\n        schema = get_json_schema(keywords_in_description_func)\n        assert schema[\"function\"][\"description\"] == \"Function with Args: or Returns: keywords in its description.\"\n\n\nclass TestGetCode:\n    @pytest.mark.parametrize(\n        \"code, expected\",\n        [\n            (\n                \"\"\"\n        import numpy\n        import pandas\n        \"\"\",\n                [\"numpy\", \"pandas\"],\n            ),\n            # From imports\n            (\n                \"\"\"\n        from torch import nn\n        from transformers import AutoModel\n        \"\"\",\n                [\"torch\", \"transformers\"],\n            ),\n            # Mixed case with nested imports\n            (\n                \"\"\"\n        import numpy as np\n        from torch.nn import Linear\n        import os.path\n        \"\"\",\n                [\"numpy\", \"torch\", \"os\"],\n            ),\n            # Try/except block (should be filtered)\n            (\n                \"\"\"\n        try:\n            import torch\n        except ImportError:\n            pass\n        import numpy\n        \"\"\",\n                [\"numpy\"],\n            ),\n            # Flash attention block (should be filtered)\n            (\n                \"\"\"\n        if is_flash_attn_2_available():\n            from flash_attn import flash_attn_func\n        import transformers\n        \"\"\",\n                [\"transformers\"],\n            ),\n            # Relative imports (should be excluded)\n            (\n                \"\"\"\n        from .utils import helper\n        from ..models import transformer\n        \"\"\",\n                [],\n            ),\n        ],\n    )\n    def test_get_imports(self, code: str, expected: list[str]):\n        assert 
sorted(get_imports(code)) == sorted(expected)\n"
  },
  {
    "path": "tests/test_gradio_ui.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\nimport os\nimport shutil\nimport tempfile\nimport unittest\nfrom unittest.mock import Mock, patch\n\nimport pytest\n\nfrom smolagents.agent_types import AgentAudio, AgentImage, AgentText\nfrom smolagents.gradio_ui import GradioUI, pull_messages_from_step, stream_to_gradio\nfrom smolagents.memory import ActionStep, FinalAnswerStep, PlanningStep, ToolCall\nfrom smolagents.models import ChatMessageStreamDelta\nfrom smolagents.monitoring import Timing, TokenUsage\n\n\nclass GradioUITester(unittest.TestCase):\n    def setUp(self):\n        \"\"\"Initialize test environment\"\"\"\n        self.temp_dir = tempfile.mkdtemp()\n        self.mock_agent = Mock()\n        self.ui = GradioUI(agent=self.mock_agent, file_upload_folder=self.temp_dir)\n        self.allowed_types = [\".pdf\", \".docx\", \".txt\"]\n\n    def tearDown(self):\n        \"\"\"Clean up test environment\"\"\"\n        shutil.rmtree(self.temp_dir)\n\n    def test_upload_file_default_types(self):\n        \"\"\"Test default allowed file types\"\"\"\n        default_types = [\".pdf\", \".docx\", \".txt\"]\n        for file_type in default_types:\n            with tempfile.NamedTemporaryFile(suffix=file_type) as temp_file:\n                mock_file = Mock()\n                mock_file.name = temp_file.name\n\n                textbox, uploads_log = self.ui.upload_file(mock_file, [])\n\n        
        self.assertIn(\"File uploaded:\", textbox.value)\n                self.assertEqual(len(uploads_log), 1)\n                self.assertTrue(os.path.exists(os.path.join(self.temp_dir, os.path.basename(temp_file.name))))\n\n    def test_upload_file_default_types_disallowed(self):\n        \"\"\"Test default disallowed file types\"\"\"\n        disallowed_types = [\".exe\", \".sh\", \".py\", \".jpg\"]\n        for file_type in disallowed_types:\n            with tempfile.NamedTemporaryFile(suffix=file_type) as temp_file:\n                mock_file = Mock()\n                mock_file.name = temp_file.name\n\n                textbox, uploads_log = self.ui.upload_file(mock_file, [])\n\n                self.assertEqual(textbox.value, \"File type disallowed\")\n                self.assertEqual(len(uploads_log), 0)\n\n    def test_upload_file_success(self):\n        \"\"\"Test successful file upload scenario\"\"\"\n        with tempfile.NamedTemporaryFile(suffix=\".txt\") as temp_file:\n            mock_file = Mock()\n            mock_file.name = temp_file.name\n\n            textbox, uploads_log = self.ui.upload_file(mock_file, [])\n\n            self.assertIn(\"File uploaded:\", textbox.value)\n            self.assertEqual(len(uploads_log), 1)\n            self.assertTrue(os.path.exists(os.path.join(self.temp_dir, os.path.basename(temp_file.name))))\n            self.assertEqual(uploads_log[0], os.path.join(self.temp_dir, os.path.basename(temp_file.name)))\n\n    def test_upload_file_none(self):\n        \"\"\"Test scenario when no file is selected\"\"\"\n        textbox, uploads_log = self.ui.upload_file(None, [])\n\n        self.assertEqual(textbox.value, \"No file uploaded\")\n        self.assertEqual(len(uploads_log), 0)\n\n    def test_upload_file_invalid_type(self):\n        \"\"\"Test disallowed file type\"\"\"\n        with tempfile.NamedTemporaryFile(suffix=\".exe\") as temp_file:\n            mock_file = Mock()\n            mock_file.name = 
temp_file.name\n\n            textbox, uploads_log = self.ui.upload_file(mock_file, [])\n\n            self.assertEqual(textbox.value, \"File type disallowed\")\n            self.assertEqual(len(uploads_log), 0)\n\n    def test_upload_file_special_chars(self):\n        \"\"\"Test scenario with special characters in filename\"\"\"\n        with tempfile.NamedTemporaryFile(suffix=\".txt\") as temp_file:\n            # Create a new temporary file with special characters\n            special_char_name = os.path.join(os.path.dirname(temp_file.name), \"test@#$%^&*.txt\")\n            shutil.copy(temp_file.name, special_char_name)\n            try:\n                mock_file = Mock()\n                mock_file.name = special_char_name\n\n                with patch(\"shutil.copy\"):\n                    textbox, uploads_log = self.ui.upload_file(mock_file, [])\n\n                    self.assertIn(\"File uploaded:\", textbox.value)\n                    self.assertEqual(len(uploads_log), 1)\n                    self.assertIn(\"test_____\", uploads_log[0])\n            finally:\n                # Clean up the special character file\n                if os.path.exists(special_char_name):\n                    os.remove(special_char_name)\n\n    def test_upload_file_custom_types(self):\n        \"\"\"Test custom allowed file types\"\"\"\n        with tempfile.NamedTemporaryFile(suffix=\".csv\") as temp_file:\n            mock_file = Mock()\n            mock_file.name = temp_file.name\n\n            textbox, uploads_log = self.ui.upload_file(mock_file, [], allowed_file_types=[\".csv\"])\n\n            self.assertIn(\"File uploaded:\", textbox.value)\n            self.assertEqual(len(uploads_log), 1)\n\n\nclass TestStreamToGradio:\n    \"\"\"Tests for the stream_to_gradio function.\"\"\"\n\n    @patch(\"smolagents.gradio_ui.pull_messages_from_step\")\n    def test_stream_to_gradio_memory_step(self, mock_pull_messages):\n        \"\"\"Test streaming a memory step\"\"\"\n        # 
Create mock agent and memory step\n        mock_agent = Mock()\n        mock_agent.run = Mock(return_value=[Mock(spec=ActionStep)])\n        mock_agent.model = Mock()\n        # Mock the pull_messages_from_step function to return some messages\n        mock_message = Mock()\n        mock_pull_messages.return_value = [mock_message]\n        # Call stream_to_gradio\n        result = list(stream_to_gradio(mock_agent, \"test task\"))\n        # Verify that pull_messages_from_step was called and the message was yielded\n        mock_pull_messages.assert_called_once()\n        assert result == [mock_message]\n\n    def test_stream_to_gradio_stream_delta(self):\n        \"\"\"Test streaming a ChatMessageStreamDelta\"\"\"\n        # Create mock agent and stream delta\n        mock_agent = Mock()\n        mock_delta = ChatMessageStreamDelta(content=\"Hello\")\n        mock_agent.run = Mock(return_value=[mock_delta])\n        mock_agent.model = Mock()\n        # Call stream_to_gradio\n        result = list(stream_to_gradio(mock_agent, \"test task\"))\n        # Verify that the content was yielded\n        assert result == [\"Hello\"]\n\n    def test_stream_to_gradio_multiple_deltas(self):\n        \"\"\"Test streaming multiple ChatMessageStreamDeltas\"\"\"\n        # Create mock agent and stream deltas\n        mock_agent = Mock()\n        mock_delta1 = ChatMessageStreamDelta(content=\"Hello\")\n        mock_delta2 = ChatMessageStreamDelta(content=\" world\")\n        mock_agent.run = Mock(return_value=[mock_delta1, mock_delta2])\n        mock_agent.model = Mock()\n        # Call stream_to_gradio\n        result = list(stream_to_gradio(mock_agent, \"test task\"))\n        # Verify that the content was accumulated and yielded\n        assert result == [\"Hello\", \"Hello world\"]\n\n    @pytest.mark.parametrize(\n        \"task,task_images,reset_memory,additional_args\",\n        [\n            (\"simple task\", None, False, None),\n            (\"task with images\", 
[\"image1.png\", \"image2.png\"], False, None),\n            (\"task with reset\", None, True, None),\n            (\"task with args\", None, False, {\"arg1\": \"value1\"}),\n            (\"complex task\", [\"image.png\"], True, {\"arg1\": \"value1\", \"arg2\": \"value2\"}),\n        ],\n    )\n    def test_stream_to_gradio_parameters(self, task, task_images, reset_memory, additional_args):\n        \"\"\"Test that stream_to_gradio passes parameters correctly to agent.run\"\"\"\n        # Create mock agent\n        mock_agent = Mock()\n        mock_agent.run = Mock(return_value=[])\n        # Call stream_to_gradio\n        list(\n            stream_to_gradio(\n                mock_agent,\n                task=task,\n                task_images=task_images,\n                reset_agent_memory=reset_memory,\n                additional_args=additional_args,\n            )\n        )\n        # Verify that agent.run was called with the right parameters\n        mock_agent.run.assert_called_once_with(\n            task, images=task_images, stream=True, reset=reset_memory, additional_args=additional_args\n        )\n\n\nclass TestPullMessagesFromStep:\n    def test_action_step_basic(\n        self,\n    ):\n        \"\"\"Test basic ActionStep processing.\"\"\"\n        step = ActionStep(\n            step_number=1,\n            model_output=\"This is the model output\",\n            observations=\"Some execution logs\",\n            error=None,\n            timing=Timing(start_time=1.0, end_time=3.5),\n            token_usage=TokenUsage(input_tokens=100, output_tokens=50),\n        )\n        messages = list(pull_messages_from_step(step))\n        assert len(messages) == 5  # step number, model_output, logs, footnote, divider\n        for message, expected_content in zip(\n            messages,\n            [\n                \"**Step 1**\",\n                \"This is the model output\",\n                \"execution logs\",\n                \"Input tokens: 100 | Output 
tokens: 50 | Duration: 2.5\",\n                \"-----\",\n            ],\n        ):\n            assert expected_content in message.content\n\n    def test_action_step_with_tool_calls(self):\n        \"\"\"Test ActionStep with tool calls.\"\"\"\n        step = ActionStep(\n            step_number=2,\n            tool_calls=[ToolCall(name=\"test_tool\", arguments={\"answer\": \"Test answer\"}, id=\"tool_call_1\")],\n            observations=\"Tool execution logs\",\n            timing=Timing(start_time=1.0, end_time=2.5),\n            token_usage=TokenUsage(input_tokens=100, output_tokens=50),\n        )\n        messages = list(pull_messages_from_step(step))\n        assert len(messages) == 5  # step, tool call, logs, footnote, divider\n        assert messages[1].content == \"Test answer\"\n        assert \"Used tool test_tool\" in messages[1].metadata[\"title\"]\n\n    @pytest.mark.parametrize(\n        \"tool_name, args, expected\",\n        [\n            (\"python_interpreter\", \"print('Hello')\", \"```python\\nprint('Hello')\\n```\"),\n            (\"regular_tool\", {\"key\": \"value\"}, \"{'key': 'value'}\"),\n            (\"string_args_tool\", \"simple string\", \"simple string\"),\n        ],\n    )\n    def test_action_step_tool_call_formats(self, tool_name, args, expected):\n        \"\"\"Test different formats of tool calls.\"\"\"\n        tool_call = Mock()\n        tool_call.name = tool_name\n        tool_call.arguments = args\n        step = ActionStep(\n            step_number=1,\n            tool_calls=[tool_call],\n            timing=Timing(start_time=1.0, end_time=2.5),\n            token_usage=TokenUsage(input_tokens=100, output_tokens=50),\n        )\n        messages = list(pull_messages_from_step(step))\n        tool_message = next(\n            msg\n            for msg in messages\n            if msg.role == \"assistant\" and msg.metadata and msg.metadata.get(\"title\", \"\").startswith(\"🛠️\")\n        )\n        assert expected in 
tool_message.content\n\n    def test_action_step_with_error(self):\n        \"\"\"Test ActionStep with error.\"\"\"\n        step = ActionStep(\n            step_number=3,\n            error=\"This is an error message\",\n            timing=Timing(start_time=1.0, end_time=2.0),\n            token_usage=TokenUsage(input_tokens=100, output_tokens=200),\n        )\n        messages = list(pull_messages_from_step(step))\n        error_message = next((m for m in messages if \"error\" in str(m.content).lower()), None)\n        assert error_message is not None\n        assert \"This is an error message\" in error_message.content\n\n    def test_action_step_with_images(self):\n        \"\"\"Test ActionStep with observation images.\"\"\"\n        step = ActionStep(\n            step_number=4,\n            observations_images=[\"image1.png\", \"image2.jpg\"],\n            token_usage=TokenUsage(input_tokens=100, output_tokens=200),\n            timing=Timing(start_time=1.0, end_time=2.0),\n        )\n        with patch(\"smolagents.gradio_ui.AgentImage\") as mock_agent_image:\n            mock_agent_image.return_value.to_string.side_effect = lambda: \"path/to/image.png\"\n            messages = list(pull_messages_from_step(step))\n            image_messages = [m for m in messages if \"image\" in str(m).lower()]\n            assert len(image_messages) == 2\n            assert \"path/to/image.png\" in str(image_messages[0])\n\n    @pytest.mark.parametrize(\n        \"skip_model_outputs, expected_messages_length, token_usage\",\n        [(False, 4, TokenUsage(input_tokens=80, output_tokens=30)), (True, 2, None)],\n    )\n    def test_planning_step(self, skip_model_outputs, expected_messages_length, token_usage):\n        \"\"\"Test PlanningStep processing.\"\"\"\n        step = PlanningStep(\n            plan=\"1. First step\\n2. 
Second step\",\n            model_input_messages=Mock(),\n            model_output_message=Mock(),\n            token_usage=token_usage,\n            timing=Timing(start_time=1.0, end_time=2.0),\n        )\n        messages = list(pull_messages_from_step(step, skip_model_outputs=skip_model_outputs))\n        assert len(messages) == expected_messages_length  # [header, plan,] footnote, divider\n        expected_contents = [\n            \"**Planning step**\",\n            \"1. First step\\n2. Second step\",\n            \"Input tokens: 80 | Output tokens: 30\" if token_usage else \"\",\n            \"-----\",\n        ]\n        for message, expected_content in zip(messages, expected_contents[-expected_messages_length:]):\n            assert expected_content in message.content\n\n        if not token_usage:\n            assert \"Input tokens: 80 | Output tokens: 30\" not in message.content\n\n    @pytest.mark.parametrize(\n        \"answer_type, answer_value, expected_content\",\n        [\n            (AgentText, \"This is a text answer\", \"**Final answer:**\\nThis is a text answer\\n\"),\n            (lambda: \"Plain string\", \"Plain string\", \"**Final answer:** Plain string\"),\n        ],\n    )\n    def test_final_answer_step(self, answer_type, answer_value, expected_content):\n        \"\"\"Test FinalAnswerStep with different answer types.\"\"\"\n        try:\n            final_answer = answer_type()\n        except TypeError:\n            with patch.object(answer_type, \"to_string\", return_value=answer_value):\n                final_answer = answer_type(answer_value)\n        step = FinalAnswerStep(\n            output=final_answer,\n        )\n        messages = list(pull_messages_from_step(step))\n        assert len(messages) == 1\n        assert messages[0].content == expected_content\n\n    def test_final_answer_step_image(self):\n        \"\"\"Test FinalAnswerStep with image answer.\"\"\"\n        with patch.object(AgentImage, \"to_string\", 
return_value=\"path/to/image.png\"):\n            step = FinalAnswerStep(output=AgentImage(\"path/to/image.png\"))\n            messages = list(pull_messages_from_step(step))\n            assert len(messages) == 1\n            assert messages[0].content[\"path\"] == \"path/to/image.png\"\n            assert messages[0].content[\"mime_type\"] == \"image/png\"\n\n    def test_final_answer_step_audio(self):\n        \"\"\"Test FinalAnswerStep with audio answer.\"\"\"\n        with patch.object(AgentAudio, \"to_string\", return_value=\"path/to/audio.wav\"):\n            step = FinalAnswerStep(output=AgentAudio(\"path/to/audio.wav\"))\n            messages = list(pull_messages_from_step(step))\n            assert len(messages) == 1\n            assert messages[0].content[\"path\"] == \"path/to/audio.wav\"\n            assert messages[0].content[\"mime_type\"] == \"audio/wav\"\n\n    def test_unsupported_step_type(self):\n        \"\"\"Test handling of unsupported step types.\"\"\"\n\n        class UnsupportedStep(Mock):\n            pass\n\n        step = UnsupportedStep()\n        with pytest.raises(ValueError, match=\"Unsupported step type\"):\n            list(pull_messages_from_step(step))\n"
  },
  {
    "path": "tests/test_import.py",
    "content": "import os\nimport subprocess\nimport tempfile\n\n\ndef test_import_smolagents_without_extras(monkeypatch):\n    monkeypatch.delenv(\"VIRTUAL_ENV\", raising=False)\n    with tempfile.TemporaryDirectory() as temp_dir:\n        # Create a virtual environment\n        venv_dir = os.path.join(temp_dir, \"venv\")\n        subprocess.run([\"uv\", \"venv\", venv_dir], check=True)\n\n        # Install smolagents in the virtual environment\n        subprocess.run(\n            [\"uv\", \"pip\", \"install\", \"--python\", os.path.join(venv_dir, \"bin\", \"python\"), \"smolagents @ .\"], check=True\n        )\n\n        # Run the import test in the virtual environment\n        result = subprocess.run(\n            [os.path.join(venv_dir, \"bin\", \"python\"), \"-c\", \"import smolagents\"],\n            capture_output=True,\n            text=True,\n        )\n\n    # Check if the import was successful\n    assert result.returncode == 0, (\n        \"Import failed with error: \"\n        + (result.stderr.splitlines()[-1] if result.stderr else \"No error message\")\n        + \"\\n\"\n        + result.stderr\n    )\n"
  },
  {
    "path": "tests/test_local_python_executor.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\nimport ast\nimport time\nimport types\nfrom contextlib import nullcontext as does_not_raise\nfrom textwrap import dedent\nfrom unittest.mock import patch\n\nimport numpy as np\nimport pandas as pd\nimport pytest\n\nfrom smolagents.default_tools import BASE_PYTHON_TOOLS, FinalAnswerTool\nfrom smolagents.local_python_executor import (\n    DANGEROUS_FUNCTIONS,\n    DANGEROUS_MODULES,\n    ExecutionTimeoutError,\n    InterpreterError,\n    LocalPythonExecutor,\n    PrintContainer,\n    check_import_authorized,\n    evaluate_boolop,\n    evaluate_condition,\n    evaluate_delete,\n    evaluate_python_code,\n    evaluate_subscript,\n    fix_final_answer_code,\n    get_safe_module,\n    timeout,\n)\n\n\n# Fake function we will use as tool\ndef add_two(x):\n    return x + 2\n\n\nclass TestEvaluatePythonCode:\n    def assertDictEqualNoPrint(self, dict1, dict2):\n        assert {k: v for k, v in dict1.items() if k != \"_print_outputs\"} == {\n            k: v for k, v in dict2.items() if k != \"_print_outputs\"\n        }\n\n    def test_evaluate_assign(self):\n        code = \"x = 3\"\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == 3\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"_operations_count\": {\"counter\": 2}})\n\n        code = \"x = y\"\n        state = {\"y\": 5}\n        
result, _ = evaluate_python_code(code, {}, state=state)\n        # evaluate returns the value of the last assignment.\n        assert result == 5\n        self.assertDictEqualNoPrint(state, {\"x\": 5, \"y\": 5, \"_operations_count\": {\"counter\": 2}})\n\n        code = \"a=1;b=None\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        # evaluate returns the value of the last assignment.\n        assert result is None\n\n    def test_assignment_cannot_overwrite_tool(self):\n        code = \"print = '3'\"\n        with pytest.raises(InterpreterError) as e:\n            evaluate_python_code(code, {\"print\": print}, state={})\n        assert \"Cannot assign to name 'print': doing this would erase the existing tool!\" in str(e)\n\n    def test_subscript_call(self):\n        code = \"\"\"def foo(x,y):return x*y\\n\\ndef boo(y):\\n\\treturn y**3\\nfun = [foo, boo]\\nresult_foo = fun[0](4,2)\\nresult_boo = fun[1](4)\"\"\"\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n        assert result == 64\n        assert state[\"result_foo\"] == 8\n        assert state[\"result_boo\"] == 64\n\n    def test_evaluate_call(self):\n        code = \"y = add_two(x)\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {\"add_two\": add_two}, state=state)\n        assert result == 5\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"y\": 5, \"_operations_count\": {\"counter\": 3}})\n\n        # Should not work without the tool\n        with pytest.raises(InterpreterError, match=\"Forbidden function evaluation: 'add_two'\"):\n            evaluate_python_code(code, {}, state=state)\n\n    @pytest.mark.parametrize(\n        \"code, expected_result\",\n        [\n            # Basic **kwargs unpacking\n            (\n                \"\"\"\ndef test_func(a, b=10, **kwargs):\n    return a + b + sum(kwargs.values())\n\nkwargs_dict = {'x': 5, 'y': 15}\ntest_func(1, **kwargs_dict)\n\"\"\",\n    
            31,  # 1 + 10 + 5 + 15\n            ),\n            # **kwargs with regular kwargs\n            (\n                \"\"\"\ndef test_func(a, **kwargs):\n    return a + sum(kwargs.values())\n\nkwargs_dict = {'x': 5, 'y': 15}\ntest_func(1, b=20, **kwargs_dict)\n\"\"\",\n                41,  # 1 + 20 + 5 + 15\n            ),\n            # Multiple **kwargs unpacking\n            (\n                \"\"\"\ndef test_func(**kwargs):\n    return sum(kwargs.values())\n\ndict1 = {'a': 1, 'b': 2}\ndict2 = {'c': 3, 'd': 4}\ntest_func(**dict1, **dict2)\n\"\"\",\n                10,  # 1 + 2 + 3 + 4\n            ),\n            # **kwargs with positional args\n            (\n                \"\"\"\ndef test_func(x, y, **kwargs):\n    return x * y + sum(kwargs.values())\n\nparams = {'factor': 2, 'offset': 5}\ntest_func(3, 4, **params)\n\"\"\",\n                19,  # 3 * 4 + 2 + 5\n            ),\n            # Empty **kwargs dict\n            (\n                \"\"\"\ndef test_func(a, **kwargs):\n    return a + len(kwargs)\n\nempty_dict = {}\ntest_func(10, **empty_dict)\n\"\"\",\n                10,  # 10 + 0\n            ),\n        ],\n    )\n    def test_evaluate_call_starred_kwargs(self, code, expected_result):\n        result, _ = evaluate_python_code(code, {\"sum\": sum, \"len\": len}, state={})\n        assert result == expected_result\n\n    @pytest.mark.parametrize(\n        \"code, expected_error_message\",\n        [\n            # Non-dict value in **kwargs\n            (\n                \"\"\"\ndef test_func(**kwargs):\n    return sum(kwargs.values())\n\nnot_a_dict = [1, 2, 3]\ntest_func(**not_a_dict)\n\"\"\",\n                \"Cannot unpack non-dict value in **kwargs: list\",\n            ),\n            # **kwargs with non-dict variable\n            (\n                \"\"\"\ndef test_func(**kwargs):\n    return kwargs\n\ntest_func(**42)\n\"\"\",\n                \"Cannot unpack non-dict value in **kwargs: int\",\n            ),\n            # 
**kwargs with None\n            (\n                \"\"\"\ndef test_func(**kwargs):\n    return kwargs\n\ntest_func(**None)\n\"\"\",\n                \"Cannot unpack non-dict value in **kwargs: NoneType\",\n            ),\n        ],\n    )\n    def test_evaluate_call_starred_kwargs_errors(self, code, expected_error_message):\n        \"\"\"Test that **kwargs unpacking raises appropriate errors for non-dict values.\"\"\"\n        with pytest.raises(InterpreterError) as exception_info:\n            evaluate_python_code(code, {\"sum\": sum}, state={})\n        assert expected_error_message in str(exception_info.value)\n\n    def test_evaluate_class_def(self):\n        code = dedent('''\\\n            class MyClass:\n                \"\"\"A class with a value.\"\"\"\n\n                def __init__(self, value):\n                    self.value = value\n\n                def get_value(self):\n                    return self.value\n\n            instance = MyClass(42)\n            result = instance.get_value()\n        ''')\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == 42\n        assert state[\"instance\"].__doc__ == \"A class with a value.\"\n\n    def test_evaluate_class_def_with_assign_attribute_target(self):\n        \"\"\"\n        Test evaluate_class_def function when stmt is an instance of ast.Assign with ast.Attribute target.\n        \"\"\"\n        code = dedent(\"\"\"\n        class TestSubClass:\n            attr1 = 1\n        class TestClass:\n            data = TestSubClass()\n            data.attr1 = \"value1\"\n            data.attr2 = \"value2\"\n        result = (TestClass.data.attr1, TestClass.data.attr2)\n        \"\"\")\n\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n\n        assert result == (\"value1\", \"value2\")\n        assert isinstance(state[\"TestClass\"], type)\n        assert state[\"TestClass\"].data.attr1 == 
\"value1\"\n        assert state[\"TestClass\"].data.attr2 == \"value2\"\n\n    def test_evaluate_constant(self):\n        code = \"x = 3\"\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == 3\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"_operations_count\": {\"counter\": 2}})\n\n    def test_evaluate_dict(self):\n        code = \"test_dict = {'x': x, 'y': add_two(x)}\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {\"add_two\": add_two}, state=state)\n        assert result == {\"x\": 3, \"y\": 5}\n        self.assertDictEqualNoPrint(\n            state, {\"x\": 3, \"test_dict\": {\"x\": 3, \"y\": 5}, \"_operations_count\": {\"counter\": 7}}\n        )\n\n    def test_evaluate_expression(self):\n        code = \"x = 3\\ny = 5\"\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        # evaluate returns the value of the last assignment.\n        assert result == 5\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"y\": 5, \"_operations_count\": {\"counter\": 4}})\n\n    def test_evaluate_f_string(self):\n        code = \"text = f'This is x: {x}.'\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        # evaluate returns the value of the last assignment.\n        assert result == \"This is x: 3.\"\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"text\": \"This is x: 3.\", \"_operations_count\": {\"counter\": 6}})\n\n    def test_evaluate_f_string_with_format(self):\n        code = \"text = f'This is x: {x:.2f}.'\"\n        state = {\"x\": 3.336}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == \"This is x: 3.34.\"\n        self.assertDictEqualNoPrint(\n            state, {\"x\": 3.336, \"text\": \"This is x: 3.34.\", \"_operations_count\": {\"counter\": 8}}\n        )\n\n    def 
test_evaluate_f_string_with_complex_format(self):\n        code = \"text = f'This is x: {x:>{width}.{precision}f}.'\"\n        state = {\"x\": 3.336, \"width\": 10, \"precision\": 2}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == \"This is x:       3.34.\"\n        self.assertDictEqualNoPrint(\n            state,\n            {\n                \"x\": 3.336,\n                \"width\": 10,\n                \"precision\": 2,\n                \"text\": \"This is x:       3.34.\",\n                \"_operations_count\": {\"counter\": 14},\n            },\n        )\n\n    def test_evaluate_if(self):\n        code = \"if x <= 3:\\n    y = 2\\nelse:\\n    y = 5\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        # evaluate returns the value of the last assignment.\n        assert result == 2\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"y\": 2, \"_operations_count\": {\"counter\": 6}})\n\n        state = {\"x\": 8}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        # evaluate returns the value of the last assignment.\n        assert result == 5\n        self.assertDictEqualNoPrint(state, {\"x\": 8, \"y\": 5, \"_operations_count\": {\"counter\": 6}})\n\n    def test_evaluate_list(self):\n        code = \"test_list = [x, add_two(x)]\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {\"add_two\": add_two}, state=state)\n        assert result == [3, 5]\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"test_list\": [3, 5], \"_operations_count\": {\"counter\": 5}})\n\n    def test_evaluate_name(self):\n        code = \"y = x\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == 3\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"y\": 3, \"_operations_count\": {\"counter\": 2}})\n\n    def test_evaluate_subscript(self):\n        code = 
\"test_list = [x, add_two(x)]\\ntest_list[1]\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {\"add_two\": add_two}, state=state)\n        assert result == 5\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"test_list\": [3, 5], \"_operations_count\": {\"counter\": 9}})\n\n        code = \"test_dict = {'x': x, 'y': add_two(x)}\\ntest_dict['y']\"\n        state = {\"x\": 3}\n        result, _ = evaluate_python_code(code, {\"add_two\": add_two}, state=state)\n        assert result == 5\n        self.assertDictEqualNoPrint(\n            state, {\"x\": 3, \"test_dict\": {\"x\": 3, \"y\": 5}, \"_operations_count\": {\"counter\": 11}}\n        )\n\n        code = \"vendor = {'revenue': 31000, 'rent': 50312}; vendor['ratio'] = round(vendor['revenue'] / vendor['rent'], 2)\"\n        state = {}\n        evaluate_python_code(code, {\"min\": min, \"print\": print, \"round\": round}, state=state)\n        assert state[\"vendor\"] == {\"revenue\": 31000, \"rent\": 50312, \"ratio\": 0.62}\n\n    def test_subscript_string_with_string_index_raises_appropriate_error(self):\n        code = \"\"\"\nsearch_results = \"[{'title': 'Paris, Ville de Paris, France Weather Forecast | AccuWeather', 'href': 'https://www.accuweather.com/en/fr/paris/623/weather-forecast/623', 'body': 'Get the latest weather forecast for Paris, Ville de Paris, France , including hourly, daily, and 10-day outlooks. 
AccuWeather provides you with reliable and accurate information on temperature ...'}]\"\nfor result in search_results:\n    if 'current' in result['title'].lower() or 'temperature' in result['title'].lower():\n        current_weather_url = result['href']\n        print(current_weather_url)\n        break\"\"\"\n        with pytest.raises(InterpreterError) as e:\n            evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n            assert \"You're trying to subscript a string with a string index\" in e\n\n    def test_evaluate_for(self):\n        code = \"x = 0\\nfor i in range(3):\\n    x = i\"\n        state = {}\n        result, _ = evaluate_python_code(code, {\"range\": range}, state=state)\n        assert result == 2\n        self.assertDictEqualNoPrint(state, {\"x\": 2, \"i\": 2, \"_operations_count\": {\"counter\": 11}})\n\n    def test_evaluate_binop(self):\n        code = \"y + x\"\n        state = {\"x\": 3, \"y\": 6}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == 9\n        self.assertDictEqualNoPrint(state, {\"x\": 3, \"y\": 6, \"_operations_count\": {\"counter\": 4}})\n\n    def test_recursive_function(self):\n        code = \"\"\"\ndef recur_fibo(n):\n    if n <= 1:\n        return n\n    else:\n        return(recur_fibo(n-1) + recur_fibo(n-2))\nrecur_fibo(6)\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == 8\n\n    def test_max_operations(self):\n        # Check that operation counter is not reset in functions\n        code = dedent(\n            \"\"\"\n            def func(a):\n                for j in range(10):\n                    a += j\n                return a\n\n            for i in range(5):\n                func(i)\n            \"\"\"\n        )\n        with patch(\"smolagents.local_python_executor.MAX_OPERATIONS\", 100):\n            with pytest.raises(InterpreterError) as exception_info:\n                evaluate_python_code(code, 
{\"range\": range}, state={})\n        assert \"Reached the max number of operations\" in str(exception_info.value)\n\n    def test_operations_count(self):\n        # Check that operation counter is not reset in functions\n        code = dedent(\n            \"\"\"\n            def func():\n                return 0\n\n            func()\n            \"\"\"\n        )\n        state = {}\n        evaluate_python_code(code, {\"range\": range}, state=state)\n        assert state[\"_operations_count\"][\"counter\"] == 5\n\n    def test_evaluate_string_methods(self):\n        code = \"'hello'.replace('h', 'o').split('e')\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == [\"o\", \"llo\"]\n\n    def test_evaluate_slicing(self):\n        code = \"'hello'[1:3][::-1]\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == \"le\"\n\n    def test_access_attributes(self):\n        class A:\n            attr = 2\n\n        code = \"A.attr\"\n        result, _ = evaluate_python_code(code, {}, state={\"A\": A})\n        assert result == 2\n\n    def test_list_comprehension(self):\n        code = \"sentence = 'THESEAGULL43'\\nmeaningful_sentence = '-'.join([char.lower() for char in sentence if char.isalpha()])\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == \"t-h-e-s-e-a-g-u-l-l\"\n\n    def test_string_indexing(self):\n        code = \"\"\"text_block = [\n    \"THESE\",\n    \"AGULL\"\n]\nsentence = \"\"\nfor block in text_block:\n    for col in range(len(text_block[0])):\n        sentence += block[col]\n        \"\"\"\n        result, _ = evaluate_python_code(code, {\"len\": len, \"range\": range}, state={})\n        assert result == \"THESEAGULL\"\n\n    def test_tuples(self):\n        code = \"x = (1, 2, 3)\\nx[1]\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == 2\n\n        code = \"\"\"\ndigits, i = [1, 2, 3], 
1\ndigits[i], digits[i + 1] = digits[i + 1], digits[i]\"\"\"\n        evaluate_python_code(code, {\"range\": range, \"print\": print, \"int\": int}, {})\n\n        code = \"\"\"\ndef calculate_isbn_10_check_digit(number):\n    total = sum((10 - i) * int(digit) for i, digit in enumerate(number))\n    remainder = total % 11\n    check_digit = 11 - remainder\n    if check_digit == 10:\n        return 'X'\n    elif check_digit == 11:\n        return '0'\n    else:\n        return str(check_digit)\n\n# Given 9-digit numbers\nnumbers = [\n    \"478225952\",\n    \"643485613\",\n    \"739394228\",\n    \"291726859\",\n    \"875262394\",\n    \"542617795\",\n    \"031810713\",\n    \"957007669\",\n    \"871467426\"\n]\n\n# Calculate check digits for each number\ncheck_digits = [calculate_isbn_10_check_digit(number) for number in numbers]\nprint(check_digits)\n\"\"\"\n        state = {}\n        evaluate_python_code(\n            code,\n            {\n                \"range\": range,\n                \"print\": print,\n                \"sum\": sum,\n                \"enumerate\": enumerate,\n                \"int\": int,\n                \"str\": str,\n            },\n            state,\n        )\n\n    def test_dictcomp(self):\n        code = \"x = {i: i**2 for i in range(3)}\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        assert result == {0: 0, 1: 1, 2: 4}\n\n        code = \"{num: name for num, name in {101: 'a', 102: 'b'}.items() if name not in ['a']}\"\n        result, _ = evaluate_python_code(code, {\"print\": print}, state={}, authorized_imports=[\"pandas\"])\n        assert result == {102: \"b\"}\n\n        code = \"\"\"\nshifts = {'A': ('6:45', '8:00'), 'B': ('10:00', '11:45')}\nshift_minutes = {worker: ('a', 'b') for worker, (start, end) in shifts.items()}\n\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == {\"A\": (\"a\", \"b\"), \"B\": (\"a\", \"b\")}\n\n    def 
test_dictcomp_nested(self):\n        code = \"\"\"\nsimple_map = {\n    (x, y): f\"key_{x}_{y}\"\n    for x in [1, 2]\n    for y in ['a', 'b']\n}\n\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == {(1, \"a\"): \"key_1_a\", (1, \"b\"): \"key_1_b\", (2, \"a\"): \"key_2_a\", (2, \"b\"): \"key_2_b\"}\n\n    def test_listcomp(self):\n        code = \"x = [i for i in range(3)]\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        assert result == [0, 1, 2]\n\n    def test_listcomp_nested(self):\n        code = \"\"\"\nsimple_list = [\n    (x, y)\n    for x in [1, 2, 1]\n    for y in ['a', 'b']\n]\n\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == [(1, \"a\"), (1, \"b\"), (2, \"a\"), (2, \"b\"), (1, \"a\"), (1, \"b\")]\n\n    def test_setcomp(self):\n        code = \"batman_times = {entry['time'] for entry in [{'time': 10}, {'time': 19}, {'time': 20}]}\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == {10, 19, 20}\n\n    def test_setcomp_nested(self):\n        code = \"\"\"\nsimple_set = {\n    (x, y)\n    for x in [1, 2, 1]\n    for y in ['a', 'b']\n}\n\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == {(1, \"a\"), (1, \"b\"), (2, \"a\"), (2, \"b\")}\n\n    def test_generatorexp(self):\n        code = \"x = (i for i in range(3))\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        # assert not isinstance(result, list)\n        assert isinstance(result, types.GeneratorType)\n        assert list(result) == [0, 1, 2]\n\n    def test_generatorexp_with_infinite_sequence(self):\n        \"\"\"Test that generator expressions handle infinite sequences correctly without hanging.\"\"\"\n        code = dedent(\n            \"\"\"\\\n            import itertools\n\n            def infinite_counter():\n                return 
itertools.count()\n\n            # Create a generator expression that filters an infinite sequence\n            even_numbers = (x for x in infinite_counter() if x % 2 == 0)\n\n            # Get just the first 3 even numbers\n            first_three = []\n            gen_iter = iter(even_numbers)\n            for _ in range(3):\n                first_three.append(next(gen_iter))\n\n            result = first_three\n            \"\"\"\n        )\n\n        state = {}\n        result, _ = evaluate_python_code(code, {\"int\": int, \"iter\": iter, \"next\": next, \"range\": range}, state=state)\n\n        # Verify we got the expected values\n        assert result == [0, 2, 4]\n\n        # Verify it's actually a generator\n        even_numbers = state[\"even_numbers\"]\n        assert isinstance(even_numbers, types.GeneratorType)\n\n        # If this were a list, the code would hang indefinitely trying to\n        # evaluate the entire infinite sequence upfront\n\n    def test_break(self):\n        code = \"for i in range(10):\\n    if i == 5:\\n        break\\ni\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        assert result == 5\n\n    def test_pass(self):\n        code = \"for i in range(10):\\n    if i == 5:\\n        pass\\ni\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        assert result == 9\n\n    def test_continue(self):\n        code = \"cnt = 0\\nfor i in range(10):\\n    continue\\n    cnt += 1\\ncnt\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        assert result == 0\n\n        code = \"cnt = 0\\nfor i in range(3):\\n    if i == 1:\\n        continue\\n    cnt += 1\\ncnt\"\n        result, _ = evaluate_python_code(code, {\"range\": range}, state={})\n        assert result == 2\n\n    def test_call_int(self):\n        code = \"import math\\nstr(math.ceil(149))\"\n        result, _ = evaluate_python_code(code, {\"str\": lambda x: str(x)}, 
state={})\n        assert result == \"149\"\n\n    def test_lambda(self):\n        code = \"f = lambda x: x + 2\\nf(3)\"\n        result, _ = evaluate_python_code(code, {}, state={})\n        assert result == 5\n\n    def test_tuple_assignment(self):\n        code = \"a, b = 0, 1\\nb\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == 1\n\n    def test_while(self):\n        code = \"i = 0\\nwhile i < 3:\\n    i += 1\\ni\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == 3\n\n        # test infinite loop\n        code = \"i = 0\\nwhile i < 3:\\n    i -= 1\\ni\"\n        with patch(\"smolagents.local_python_executor.MAX_WHILE_ITERATIONS\", 100):\n            with pytest.raises(InterpreterError, match=\".*Maximum number of 100 iterations in While loop exceeded\"):\n                evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n\n        # test lazy evaluation\n        code = dedent(\n            \"\"\"\n            house_positions = [0, 7, 10, 15, 18, 22, 22]\n            i, n, loc = 0, 7, 30\n            while i < n and house_positions[i] <= loc:\n                i += 1\n            \"\"\"\n        )\n        state = {}\n        evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n\n    def test_generator(self):\n        code = \"a = [1, 2, 3, 4, 5]; b = (i**2 for i in a); list(b)\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == [1, 4, 9, 16, 25]\n\n    def test_boolops(self):\n        code = \"\"\"if (not (a > b and a > c)) or d > e:\n    best_city = \"Brooklyn\"\nelse:\n    best_city = \"Manhattan\"\n    best_city\n    \"\"\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={\"a\": 1, \"b\": 2, \"c\": 3, \"d\": 4, \"e\": 5})\n        assert result == \"Brooklyn\"\n\n        code = \"\"\"if d > e and a < b:\n    best_city = \"Brooklyn\"\nelif d < e and a < b:\n    
best_city = \"Sacramento\"\nelse:\n    best_city = \"Manhattan\"\n    best_city\n    \"\"\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={\"a\": 1, \"b\": 2, \"c\": 3, \"d\": 4, \"e\": 5})\n        assert result == \"Sacramento\"\n\n        # Short-circuit evaluation:\n        # (T and 0) or (T and T) => 0 or True => True\n        code = \"result = (x > 3 and y) or (z == 10 and not y)\\nresult\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={\"x\": 5, \"y\": 0, \"z\": 10})\n        assert result\n\n        # (None or \"\") or \"Found\" => \"\" or \"Found\" => \"Found\"\n        code = \"result = (a or c) or b\\nresult\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={\"a\": None, \"b\": \"Found\", \"c\": \"\"})\n        assert result == \"Found\"\n\n        # (\"First\" and \"\") or \"Third\" => \"\" or \"Third\" -> \"Third\"\n        code = \"result = (a and b) or c\\nresult\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={\"a\": \"First\", \"b\": \"\", \"c\": \"Third\"})\n        assert result == \"Third\"\n\n    def test_if_conditions(self):\n        code = \"\"\"char='a'\nif char.isalpha():\n    print('2')\"\"\"\n        state = {}\n        evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n        assert state[\"_print_outputs\"].value == \"2\\n\"\n\n    def test_imports(self):\n        code = \"import math\\nmath.sqrt(4)\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == 2.0\n\n        code = \"from random import choice, seed\\nseed(12)\\nchoice(['win', 'lose', 'draw'])\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == \"lose\"\n\n        code = \"import time, re\\ntime.sleep(0.1)\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result is None\n\n        code = \"from queue import Queue\\nq = 
Queue()\\nq.put(1)\\nq.get()\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == 1\n\n        code = \"import itertools\\nlist(itertools.islice(range(10), 3))\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == [0, 1, 2]\n\n        code = \"import re\\nre.search('a', 'abc').group()\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == \"a\"\n\n        code = \"import stat\\nstat.S_ISREG(0o100644)\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result\n\n        code = \"import statistics\\nstatistics.mean([1, 2, 3, 4, 4])\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == 2.8\n\n        code = \"import unicodedata\\nunicodedata.name('A')\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == \"LATIN CAPITAL LETTER A\"\n\n        # Test submodules are handled properly, thus not raising error\n        code = \"import numpy.random as rd\\nrng = rd.default_rng(12345)\\nrng.random()\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={}, authorized_imports=[\"numpy.random\"])\n\n        code = \"from numpy.random import default_rng as d_rng\\nrng = d_rng(12345)\\nrng.random()\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={}, authorized_imports=[\"numpy.random\"])\n\n    def test_additional_imports(self):\n        code = \"import numpy as np\"\n        evaluate_python_code(code, authorized_imports=[\"numpy\"], state={})\n\n        # Test that allowing 'numpy.*' allows numpy root package and its submodules\n        code = \"import numpy as np\\nnp.random.default_rng(123)\\nnp.array([1, 2])\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={}, authorized_imports=[\"numpy.*\"])\n\n        # Test 
that allowing 'numpy.*' allows importing a submodule\n        code = \"import numpy.random as rd\\nrd.default_rng(12345)\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={}, authorized_imports=[\"numpy.*\"])\n\n        code = \"import numpy.random as rd\"\n        evaluate_python_code(code, authorized_imports=[\"numpy.random\"], state={})\n        evaluate_python_code(code, authorized_imports=[\"numpy.*\"], state={})\n        evaluate_python_code(code, authorized_imports=[\"*\"], state={})\n        with pytest.raises(InterpreterError):\n            evaluate_python_code(code, authorized_imports=[\"random\"], state={})\n\n        with pytest.raises(InterpreterError):\n            evaluate_python_code(code, authorized_imports=[\"numpy.a\"], state={})\n        with pytest.raises(InterpreterError):\n            evaluate_python_code(code, authorized_imports=[\"numpy.a.*\"], state={})\n\n    def test_multiple_comparators(self):\n        code = \"0 <= -1 < 4 and 0 <= -5 < 4\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert not result\n\n        code = \"0 <= 1 < 4 and 0 <= -5 < 4\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert not result\n\n        code = \"0 <= 4 < 4 and 0 <= 3 < 4\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert not result\n\n        code = \"0 <= 3 < 4 and 0 <= 3 < 4\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result\n\n    def test_print_output(self):\n        code = \"print('Hello world!')\\nprint('Ok no one cares')\"\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n        assert result is None\n        assert state[\"_print_outputs\"].value == \"Hello world!\\nOk no one cares\\n\"\n\n        # Test print in function (state copy)\n        code = \"\"\"\nprint(\"1\")\ndef function():\n    
print(\"2\")\nfunction()\"\"\"\n        state = {}\n        evaluate_python_code(code, {\"print\": print}, state=state)\n        assert state[\"_print_outputs\"].value == \"1\\n2\\n\"\n\n        # Test print in list comprehension (state copy)\n        code = \"\"\"\nprint(\"1\")\ndef function():\n    print(\"2\")\n[function() for i in range(10)]\"\"\"\n        state = {}\n        evaluate_python_code(code, {\"print\": print, \"range\": range}, state=state)\n        assert state[\"_print_outputs\"].value == \"1\\n2\\n2\\n2\\n2\\n2\\n2\\n2\\n2\\n2\\n2\\n\"\n\n    def test_tuple_target_in_iterator(self):\n        code = \"for a, b in [('Ralf Weikert', 'Austria'), ('Samuel Seungwon Lee', 'South Korea')]:res = a.split()[0]\"\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert result == \"Samuel\"\n\n    def test_classes(self):\n        code = \"\"\"\nclass Animal:\n    species = \"Generic Animal\"\n\n    def __init__(self, name, age):\n        self.name = name\n        self.age = age\n\n    def sound(self):\n        return \"The animal makes a sound.\"\n\n    def __str__(self):\n        return f\"{self.name}, {self.age} years old\"\n\nclass Dog(Animal):\n    species = \"Canine\"\n\n    def __init__(self, name, age, breed):\n        super().__init__(name, age)\n        self.breed = breed\n\n    def sound(self):\n        return \"The dog barks.\"\n\n    def __str__(self):\n        return f\"{self.name}, {self.age} years old, {self.breed}\"\n\nclass Cat(Animal):\n    def sound(self):\n        return \"The cat meows.\"\n\n    def __str__(self):\n        return f\"{self.name}, {self.age} years old, {self.species}\"\n\n\n# Testing multiple instances\ndog1 = Dog(\"Fido\", 3, \"Labrador\")\ndog2 = Dog(\"Buddy\", 5, \"Golden Retriever\")\n\n# Testing method with built-in function\nanimals = [dog1, dog2, Cat(\"Whiskers\", 2)]\nnum_animals = len(animals)\n\n# Testing exceptions in methods\nclass ExceptionTest:\n    def 
method_that_raises(self):\n        raise ValueError(\"An error occurred\")\n\ntry:\n    exc_test = ExceptionTest()\n    exc_test.method_that_raises()\nexcept ValueError as e:\n    exception_message = str(e)\n\n\n# Collecting results\ndog1_sound = dog1.sound()\ndog1_str = str(dog1)\ndog2_sound = dog2.sound()\ndog2_str = str(dog2)\ncat = Cat(\"Whiskers\", 2)\ncat_sound = cat.sound()\ncat_str = str(cat)\n    \"\"\"\n        state = {}\n        evaluate_python_code(\n            code,\n            {\"print\": print, \"len\": len, \"super\": super, \"str\": str, \"sum\": sum},\n            state=state,\n        )\n\n        # Assert results\n        assert state[\"dog1_sound\"] == \"The dog barks.\"\n        assert state[\"dog1_str\"] == \"Fido, 3 years old, Labrador\"\n        assert state[\"dog2_sound\"] == \"The dog barks.\"\n        assert state[\"dog2_str\"] == \"Buddy, 5 years old, Golden Retriever\"\n        assert state[\"cat_sound\"] == \"The cat meows.\"\n        assert state[\"cat_str\"] == \"Whiskers, 2 years old, Generic Animal\"\n        assert state[\"num_animals\"] == 3\n        assert state[\"exception_message\"] == \"An error occurred\"\n\n    def test_variable_args(self):\n        code = \"\"\"\ndef var_args_method(self, *args, **kwargs):\n    return sum(args) + sum(kwargs.values())\n\nvar_args_method(1, 2, 3, x=4, y=5)\n\"\"\"\n        state = {}\n        result, _ = evaluate_python_code(code, {\"sum\": sum}, state=state)\n        assert result == 15\n\n    def test_exceptions(self):\n        code = \"\"\"\ndef method_that_raises(self):\n    raise ValueError(\"An error occurred\")\n\ntry:\n    method_that_raises()\nexcept ValueError as e:\n    exception_message = str(e)\n    \"\"\"\n        state = {}\n        evaluate_python_code(\n            code,\n            {\"print\": print, \"len\": len, \"super\": super, \"str\": str, \"sum\": sum},\n            state=state,\n        )\n        assert state[\"exception_message\"] == \"An error occurred\"\n\n 
   def test_print(self):\n        code = \"print(min([1, 2, 3]))\"\n        state = {}\n        evaluate_python_code(code, {\"min\": min, \"print\": print}, state=state)\n        assert state[\"_print_outputs\"].value == \"1\\n\"\n\n    def test_types_as_objects(self):\n        code = \"type_a = float(2); type_b = str; type_c = int\"\n        state = {}\n        result, is_final_answer = evaluate_python_code(code, {\"float\": float, \"str\": str, \"int\": int}, state=state)\n        # Type objects are not wrapped by safer_func\n        assert not hasattr(result, \"__wrapped__\")\n        assert result is int\n\n    def test_tuple_id(self):\n        code = \"\"\"\nfood_items = {\"apple\": 2, \"banana\": 3, \"orange\": 1, \"pear\": 1}\nunique_food_items = [item for item, count in food_item_counts.items() if count == 1]\n\"\"\"\n        state = {}\n        result, is_final_answer = evaluate_python_code(code, {}, state=state)\n        assert result == [\"orange\", \"pear\"]\n\n    def test_nonsimple_augassign(self):\n        code = \"\"\"\ncounts_dict = {'a': 0}\ncounts_dict['a'] += 1\ncounts_list = [1, 2, 3]\ncounts_list += [4, 5, 6]\n\nclass Counter:\n    def __init__(self):\n        self.count = 0\n\na = Counter()\na.count += 1\n\"\"\"\n        state = {}\n        evaluate_python_code(code, {}, state=state)\n        assert state[\"counts_dict\"] == {\"a\": 1}\n        assert state[\"counts_list\"] == [1, 2, 3, 4, 5, 6]\n        assert state[\"a\"].count == 1\n\n    def test_adding_int_to_list_raises_error(self):\n        code = \"\"\"\ncounts = [1, 2, 3]\ncounts += 1\"\"\"\n        with pytest.raises(InterpreterError) as e:\n            evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert \"Cannot add non-list value 1 to a list.\" in str(e)\n\n    def test_error_highlights_correct_line_of_code(self):\n        code = \"\"\"a = 1\nb = 2\n\ncounts = [1, 2, 3]\ncounts += 1\nb += 1\"\"\"\n        with pytest.raises(InterpreterError) as e:\n            
evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert \"Code execution failed at line 'counts += 1\" in str(e)\n\n    def test_error_type_returned_in_function_call(self):\n        code = \"\"\"def error_function():\n    raise ValueError(\"error\")\n\nerror_function()\"\"\"\n        with pytest.raises(InterpreterError) as e:\n            evaluate_python_code(code)\n        assert \"error\" in str(e)\n        assert \"ValueError\" in str(e)\n\n    def test_assert(self):\n        code = \"\"\"\nassert 1 == 1\nassert 1 == 2\n\"\"\"\n        with pytest.raises(InterpreterError) as e:\n            evaluate_python_code(code, BASE_PYTHON_TOOLS, state={})\n        assert \"1 == 2\" in str(e) and \"1 == 1\" not in str(e)\n\n    def test_with_context_manager(self):\n        code = \"\"\"\nclass SimpleLock:\n    def __init__(self):\n        self.locked = False\n\n    def __enter__(self):\n        self.locked = True\n        return self\n\n    def __exit__(self, exc_type, exc_value, traceback):\n        self.locked = False\n\nlock = SimpleLock()\n\nwith lock as l:\n    assert l.locked == True\n\nassert lock.locked == False\n    \"\"\"\n        state = {}\n        tools = {}\n        evaluate_python_code(code, tools, state=state)\n\n    def test_with_context_manager_enter_returns_different_object(self):\n        \"\"\"Test that __exit__ is called on the context manager, not the __enter__ return value.\"\"\"\n        code = \"\"\"\nclass MyContextManager:\n    def __init__(self):\n        self.entered = False\n        self.exited = False\n\n    def __enter__(self):\n        self.entered = True\n        return \"I am NOT the context manager\"\n\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        self.exited = True\n        return False\n\ncm = MyContextManager()\nwith cm as val:\n    assert val == \"I am NOT the context manager\"\n    assert cm.entered == True\n\nassert cm.exited == True\n    \"\"\"\n        evaluate_python_code(code, {}, state={})\n\n   
 def test_with_context_manager_no_as_clause_exit_called(self):\n        \"\"\"Test that __exit__ is called on context managers used without 'as' clause.\"\"\"\n        code = \"\"\"\nclass MyContextManager:\n    def __init__(self):\n        self.exited = False\n\n    def __enter__(self):\n        return \"not self\"\n\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        self.exited = True\n        return False\n\ncm = MyContextManager()\nwith cm:\n    pass\n\nassert cm.exited == True\n    \"\"\"\n        evaluate_python_code(code, {}, state={})\n\n    def test_with_exception_suppressed_by_exit(self):\n        \"\"\"Test that __exit__ returning True suppresses the exception.\"\"\"\n        code = \"\"\"\nclass Suppressor:\n    def __init__(self):\n        self.exit_called = False\n    def __enter__(self):\n        return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        self.exit_called = True\n        return True  # suppress\n\ncm = Suppressor()\nwith cm:\n    raise ValueError(\"should be suppressed\")\n\nassert cm.exit_called == True\n        \"\"\"\n        evaluate_python_code(code, {}, state={})\n\n    def test_with_exception_not_suppressed_by_exit(self):\n        \"\"\"Test that __exit__ returning False re-raises the original exception.\"\"\"\n        code = \"\"\"\nclass NonSuppressor:\n    def __enter__(self):\n        return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        return False\n\nwith NonSuppressor():\n    raise ValueError(\"should propagate\")\n        \"\"\"\n        with pytest.raises(ValueError, match=\"should propagate\"):\n            evaluate_python_code(code, {}, state={})\n\n    def test_with_multiple_cms_inner_suppresses_outer_sees_no_exception(self):\n        \"\"\"Test that when the inner CM suppresses, the outer CM's __exit__ gets (None, None, None).\"\"\"\n        code = \"\"\"\ncalls = []\n\nclass Outer:\n    def __enter__(self): return self\n    def __exit__(self, exc_type, exc_val, 
exc_tb):\n        calls.append((\"outer\", exc_type))\n        return False\n\nclass Inner:\n    def __enter__(self): return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        calls.append((\"inner\", exc_type))\n        return True  # suppress\n\nwith Outer(), Inner():\n    raise ValueError(\"suppressed by inner\")\n\nassert calls[0] == (\"inner\", ValueError), calls\nassert calls[1] == (\"outer\", None), calls\n        \"\"\"\n        evaluate_python_code(code, {}, state={})\n\n    def test_with_multiple_cms_neither_suppresses(self):\n        \"\"\"Test that when no CM suppresses, the original exception propagates.\"\"\"\n        code = \"\"\"\ncalls = []\n\nclass Recorder:\n    def __init__(self, name):\n        self.name = name\n    def __enter__(self): return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        calls.append((self.name, exc_type))\n        return False\n\nwith Recorder(\"outer\"), Recorder(\"inner\"):\n    raise RuntimeError(\"should propagate\")\n        \"\"\"\n        with pytest.raises(InterpreterError, match=\"should propagate\"):\n            evaluate_python_code(code, {}, state={})\n\n    def test_with_multiple_cms_outer_suppresses(self):\n        \"\"\"Test that the outer CM can suppress after the inner CM does not.\"\"\"\n        code = \"\"\"\ncalls = []\n\nclass Outer:\n    def __enter__(self): return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        calls.append((\"outer\", exc_type))\n        return True  # suppress\n\nclass Inner:\n    def __enter__(self): return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        calls.append((\"inner\", exc_type))\n        return False  # don't suppress\n\nwith Outer(), Inner():\n    raise ValueError(\"suppressed by outer\")\n\nassert calls[0] == (\"inner\", ValueError), calls\nassert calls[1] == (\"outer\", ValueError), calls\n        \"\"\"\n        evaluate_python_code(code, {}, state={})\n\n    def 
test_with_exit_raising_replaces_original_exception(self):\n        \"\"\"Test that an exception raised inside __exit__ replaces the original.\"\"\"\n        code = \"\"\"\nclass RaisingExit:\n    def __enter__(self): return self\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        raise RuntimeError(\"from __exit__\")\n\nwith RaisingExit():\n    raise ValueError(\"original\")\n        \"\"\"\n        with pytest.raises(InterpreterError, match=\"from __exit__\"):\n            evaluate_python_code(code, {}, state={})\n\n    def test_default_arg_in_function(self):\n        code = \"\"\"\ndef f(a, b=333, n=1000):\n    return b + n\nn = f(1, n=667)\n\"\"\"\n        res, is_final_answer = evaluate_python_code(code, {}, {})\n        assert res == 1000\n        assert not is_final_answer\n\n    def test_set(self):\n        code = \"\"\"\nS1 = {'a', 'b', 'c'}\nS2 = {'b', 'c', 'd'}\nS3 = S1.difference(S2)\nS4 = S1.intersection(S2)\n\"\"\"\n        state = {}\n        evaluate_python_code(code, {}, state=state)\n        assert state[\"S3\"] == {\"a\"}\n        assert state[\"S4\"] == {\"b\", \"c\"}\n\n    def test_return(self):\n        # test early returns\n        code = \"\"\"\ndef add_one(n, shift):\n    if True:\n        return n + shift\n    return n\n\nadd_one(1, 1)\n\"\"\"\n        state = {}\n        result, is_final_answer = evaluate_python_code(\n            code, {\"print\": print, \"range\": range, \"ord\": ord, \"chr\": chr}, state=state\n        )\n        assert result == 2\n\n        # test returning None\n        code = \"\"\"\ndef returns_none(a):\n    return\n\nreturns_none(1)\n\"\"\"\n        state = {}\n        result, is_final_answer = evaluate_python_code(\n            code, {\"print\": print, \"range\": range, \"ord\": ord, \"chr\": chr}, state=state\n        )\n        assert result is None\n\n    def test_nested_for_loop(self):\n        code = \"\"\"\nall_res = []\nfor i in range(10):\n    subres = []\n    for j in range(i):\n        
subres.append(j)\n    all_res.append(subres)\n\nout = [i for sublist in all_res for i in sublist]\nout[:10]\n\"\"\"\n        state = {}\n        result, is_final_answer = evaluate_python_code(code, {\"print\": print, \"range\": range}, state=state)\n        assert result == [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]\n\n    def test_pandas(self):\n        code = \"\"\"\nimport pandas as pd\n\ndf = pd.DataFrame.from_dict({'SetCount': ['5', '4', '5'], 'Quantity': [1, 0, -1]})\n\ndf['SetCount'] = pd.to_numeric(df['SetCount'], errors='coerce')\n\nparts_with_5_set_count = df[df['SetCount'] == 5.0]\nparts_with_5_set_count[['Quantity', 'SetCount']].values[1]\n\"\"\"\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state, authorized_imports=[\"pandas\"])\n        assert np.array_equal(result, [-1, 5])\n\n        code = \"\"\"\nimport pandas as pd\n\ndf = pd.DataFrame.from_dict({\"AtomicNumber\": [111, 104, 105], \"ok\": [0, 1, 2]})\n\n# Filter the DataFrame to get only the rows with outdated atomic numbers\nfiltered_df = df.loc[df['AtomicNumber'].isin([104])]\n\"\"\"\n        result, _ = evaluate_python_code(code, {\"print\": print}, state={}, authorized_imports=[\"pandas\"])\n        assert np.array_equal(result.values[0], [104, 1])\n\n        # Test groupby\n        code = \"\"\"import pandas as pd\ndata = pd.DataFrame.from_dict([\n    {\"Pclass\": 1, \"Survived\": 1},\n    {\"Pclass\": 2, \"Survived\": 0},\n    {\"Pclass\": 2, \"Survived\": 1}\n])\nsurvival_rate_by_class = data.groupby('Pclass')['Survived'].mean()\n\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={}, authorized_imports=[\"pandas\"])\n        assert result.values[1] == 0.5\n\n        # Test loc and iloc\n        code = \"\"\"import pandas as pd\ndata = pd.DataFrame.from_dict([\n    {\"Pclass\": 1, \"Survived\": 1},\n    {\"Pclass\": 2, \"Survived\": 0},\n    {\"Pclass\": 2, \"Survived\": 1}\n])\nsurvival_rate_biased = 
data.loc[data['Survived']==1]['Survived'].mean()\nsurvival_rate_biased = data.loc[data['Survived']==1]['Survived'].mean()\nsurvival_rate_sorted = data.sort_values(by='Survived', ascending=False).iloc[0]\n\"\"\"\n        result, _ = evaluate_python_code(code, {}, state={}, authorized_imports=[\"pandas\"])\n\n    def test_starred(self):\n        code = \"\"\"\nfrom math import radians, sin, cos, sqrt, atan2\n\ndef haversine(lat1, lon1, lat2, lon2):\n    R = 6371000  # Radius of the Earth in meters\n    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])\n    dlat = lat2 - lat1\n    dlon = lon2 - lon1\n    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2\n    c = 2 * atan2(sqrt(a), sqrt(1 - a))\n    distance = R * c\n    return distance\n\ncoords_geneva = (46.1978, 6.1342)\ncoords_barcelona = (41.3869, 2.1660)\n\ndistance_geneva_barcelona = haversine(*coords_geneva, *coords_barcelona)\n\"\"\"\n        result, _ = evaluate_python_code(code, {\"print\": print, \"map\": map}, state={}, authorized_imports=[\"math\"])\n        assert round(result, 1) == 622395.4\n\n    def test_for(self):\n        code = \"\"\"\nshifts = {\n    \"Worker A\": (\"6:45 pm\", \"8:00 pm\"),\n    \"Worker B\": (\"10:00 am\", \"11:45 am\")\n}\n\nshift_intervals = {}\nfor worker, (start, end) in shifts.items():\n    shift_intervals[worker] = end\nshift_intervals\n\"\"\"\n        result, _ = evaluate_python_code(code, {\"print\": print, \"map\": map}, state={})\n        assert result == {\"Worker A\": \"8:00 pm\", \"Worker B\": \"11:45 am\"}\n\n    def test_syntax_error_points_error(self):\n        code = \"a = ;\"\n        with pytest.raises(InterpreterError) as e:\n            evaluate_python_code(code)\n        assert \"SyntaxError\" in str(e)\n        assert \"     ^\" in str(e)\n\n    def test_close_matches_subscript(self):\n        code = 'capitals = {\"Czech Republic\": \"Prague\", \"Monaco\": \"Monaco\", \"Bhutan\": \"Thimphu\"};capitals[\"Butan\"]'\n        
with pytest.raises(Exception) as e:\n            evaluate_python_code(code)\n        assert \"Maybe you meant one of these indexes instead\" in str(e) and \"['Bhutan']\" in str(e).replace(\"\\\\\", \"\")\n\n    def test_dangerous_builtins_calls_are_blocked(self):\n        unsafe_code = \"import os\"\n        dangerous_code = f\"\"\"\nexec = callable.__self__.exec\ncompile = callable.__self__.compile\nexec(compile('{unsafe_code}', 'no filename', 'exec'))\n\"\"\"\n\n        with pytest.raises(InterpreterError):\n            evaluate_python_code(unsafe_code, static_tools=BASE_PYTHON_TOOLS)\n\n        with pytest.raises(InterpreterError):\n            evaluate_python_code(dangerous_code, static_tools=BASE_PYTHON_TOOLS)\n\n    def test_final_answer_accepts_kwarg_answer(self):\n        code = \"final_answer(answer=2)\"\n        result, _ = evaluate_python_code(code, {\"final_answer\": (lambda answer: 2 * answer)}, state={})\n        assert result == 4\n\n    def test_final_answer_not_caught_by_except_exception(self):\n        \"\"\"Test that final_answer is not caught by generic 'except Exception' clauses.\n\n        This test reproduces the issue from GitHub issue #1905 where agent-generated\n        code with try/except Exception blocks would incorrectly catch FinalAnswerException.\n        \"\"\"\n        code = dedent(\"\"\"\n            try:\n                final_answer(1)\n            except Exception as e:\n                final_answer(2)\n        \"\"\")\n        result, is_final_answer = evaluate_python_code(code, {\"final_answer\": (lambda answer: answer)}, state={})\n        # The result should be 1 (from the first final_answer call),\n        # not 2 (which would happen if FinalAnswerException was caught)\n        assert result == 1\n        assert is_final_answer is True\n\n    def test_dangerous_builtins_are_callable_if_explicitly_added(self):\n        dangerous_code = dedent(\"\"\"\n            eval(\"1 + 1\")\n            exec(compile(\"1 + 1\", \"no 
filename\", \"exec\"))\n        \"\"\")\n        evaluate_python_code(\n            dangerous_code, static_tools={\"compile\": compile, \"eval\": eval, \"exec\": exec} | BASE_PYTHON_TOOLS\n        )\n\n    def test_can_import_os_if_explicitly_authorized(self):\n        dangerous_code = \"import os; os.listdir('./')\"\n        evaluate_python_code(dangerous_code, authorized_imports=[\"os\"])\n\n    def test_can_import_os_if_all_imports_authorized(self):\n        dangerous_code = \"import os; os.listdir('./')\"\n        evaluate_python_code(dangerous_code, authorized_imports=[\"*\"])\n\n    @pytest.mark.filterwarnings(\"ignore::DeprecationWarning\")\n    def test_can_import_scipy_if_explicitly_authorized(self):\n        code = \"import scipy\"\n        evaluate_python_code(code, authorized_imports=[\"scipy\"])\n\n    @pytest.mark.filterwarnings(\"ignore::DeprecationWarning\")\n    def test_can_import_sklearn_if_explicitly_authorized(self):\n        code = \"import sklearn\"\n        evaluate_python_code(code, authorized_imports=[\"sklearn\"])\n\n    def test_function_def_recovers_source_code(self):\n        executor = LocalPythonExecutor([])\n\n        executor.send_tools({\"final_answer\": FinalAnswerTool()})\n\n        res = executor(\n            dedent(\n                \"\"\"\n                def target_function():\n                    return \"Hello world\"\n\n                final_answer(target_function)\n                \"\"\"\n            )\n        ).output\n        assert res.__name__ == \"target_function\"\n        assert res.__source__ == \"def target_function():\\n    return 'Hello world'\"\n\n    def test_evaluate_class_def_with_pass(self):\n        code = dedent(\"\"\"\n            class TestClass:\n                pass\n\n            instance = TestClass()\n            instance.attr = \"value\"\n            result = instance.attr\n        \"\"\")\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n     
   assert result == \"value\"\n\n    def test_evaluate_class_def_with_ann_assign_name(self):\n        \"\"\"\n        Test evaluate_class_def function when stmt is an instance of ast.AnnAssign with ast.Name target.\n\n        This test verifies that annotated assignments within a class definition are correctly evaluated.\n        \"\"\"\n        code = dedent(\"\"\"\n            class TestClass:\n                x: int = 5\n                y: str = \"test\"\n\n            instance = TestClass()\n            result = (instance.x, instance.y)\n        \"\"\")\n\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n\n        assert result == (5, \"test\")\n        assert isinstance(state[\"TestClass\"], type)\n        # Type objects are not wrapped by safer_func\n        for value in state[\"TestClass\"].__annotations__.values():\n            assert not hasattr(value, \"__wrapped__\")\n        assert state[\"TestClass\"].__annotations__ == {\"x\": int, \"y\": str}\n        assert state[\"TestClass\"].x == 5\n        assert state[\"TestClass\"].y == \"test\"\n        assert isinstance(state[\"instance\"], state[\"TestClass\"])\n        assert state[\"instance\"].x == 5\n        assert state[\"instance\"].y == \"test\"\n\n    def test_evaluate_class_def_with_ann_assign_attribute(self):\n        \"\"\"\n        Test evaluate_class_def function when stmt is an instance of ast.AnnAssign with ast.Attribute target.\n\n        This test ensures that class attributes using attribute notation are correctly handled.\n        \"\"\"\n        code = dedent(\"\"\"\n        class TestSubClass:\n            attr = 1\n        class TestClass:\n            data: TestSubClass = TestSubClass()\n            data.attr: str = \"value\"\n\n        result = TestClass.data.attr\n        \"\"\")\n\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n\n        assert result == \"value\"\n        assert 
isinstance(state[\"TestClass\"], type)\n        assert state[\"TestClass\"].__annotations__.keys() == {\"data\"}\n        assert isinstance(state[\"TestClass\"].__annotations__[\"data\"], type)\n        assert state[\"TestClass\"].__annotations__[\"data\"].__name__ == \"TestSubClass\"\n        assert state[\"TestClass\"].data.attr == \"value\"\n\n    def test_evaluate_class_def_with_ann_assign_subscript(self):\n        \"\"\"\n        Test evaluate_class_def function when stmt is an instance of ast.AnnAssign with ast.Subscript target.\n\n        This test ensures that class attributes using subscript notation are correctly handled.\n        \"\"\"\n        code = dedent(\"\"\"\n        class TestClass:\n            key_data: dict = {}\n            key_data[\"key\"]: str = \"value\"\n            index_data: list = [10, 20, 30]\n            index_data[0:2]: list[str] = [\"a\", \"b\"]\n\n        result = (TestClass.key_data['key'], TestClass.index_data[1:])\n        \"\"\")\n\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n\n        assert result == (\"value\", [\"b\", 30])\n        assert isinstance(state[\"TestClass\"], type)\n        # Type objects are not wrapped by safer_func\n        for value in state[\"TestClass\"].__annotations__.values():\n            assert not hasattr(value, \"__wrapped__\")\n        assert state[\"TestClass\"].__annotations__ == {\"key_data\": dict, \"index_data\": list}\n        assert state[\"TestClass\"].key_data == {\"key\": \"value\"}\n        assert state[\"TestClass\"].index_data == [\"a\", \"b\", 30]\n\n    def test_evaluate_class_def_with_enum(self):\n        \"\"\"\n        Test evaluate_class_def function with Enum classes.\n\n        This test ensures that Enum classes are correctly handled by using the\n        appropriate metaclass and __prepare__ method.\n        \"\"\"\n        code = dedent(\"\"\"\n        from enum import Enum\n\n        class Status(Enum):\n           
 SUCCESS = \"Success\"\n            FAILURE = \"Failure\"\n            PENDING = \"Pending\"\n            ERROR = \"Error\"\n\n        status_value = Status.SUCCESS.value\n        status_name = Status.SUCCESS.name\n        \"\"\")\n\n        state = {}\n        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state, authorized_imports=[\"enum\"])\n\n        assert state[\"status_value\"] == \"Success\"\n        assert state[\"status_name\"] == \"SUCCESS\"\n        assert isinstance(state[\"Status\"], type)\n        assert hasattr(state[\"Status\"], \"SUCCESS\")\n        assert state[\"Status\"].SUCCESS.value == \"Success\"\n        assert state[\"Status\"].FAILURE.value == \"Failure\"\n        assert state[\"Status\"].PENDING.value == \"Pending\"\n        assert state[\"Status\"].ERROR.value == \"Error\"\n\n    def test_evaluate_annassign(self):\n        code = dedent(\"\"\"\\\n            # Basic annotated assignment\n            x: int = 42\n\n            # Type annotations with expressions\n            y: float = x / 2\n\n            # Type annotation without assignment\n            z: list\n\n            # Type annotation with complex value\n            names: list = [\"Alice\", \"Bob\", \"Charlie\"]\n\n            # Type hint shouldn't restrict values at runtime\n            s: str = 123  # Would be a type error in static checking, but valid at runtime\n\n            # Access the values\n            result = (x, y, names, s)\n        \"\"\")\n        state = {}\n        evaluate_python_code(code, BASE_PYTHON_TOOLS, state=state)\n        assert state[\"x\"] == 42\n        assert state[\"y\"] == 21.0\n        assert \"z\" not in state  # z should not be defined\n        assert state[\"names\"] == [\"Alice\", \"Bob\", \"Charlie\"]\n        assert state[\"s\"] == 123  # Type hints don't restrict at runtime\n        assert state[\"result\"] == (42, 21.0, [\"Alice\", \"Bob\", \"Charlie\"], 123)\n\n    @pytest.mark.parametrize(\n        \"code, 
expected_result\",\n        [\n            (\n                dedent(\"\"\"\\\n                    x = 1\n                    x += 2\n                \"\"\"),\n                3,\n            ),\n            (\n                dedent(\"\"\"\\\n                    x = \"a\"\n                    x += \"b\"\n                \"\"\"),\n                \"ab\",\n            ),\n            (\n                dedent(\"\"\"\\\n                    class Custom:\n                        def __init__(self, value):\n                            self.value = value\n                        def __iadd__(self, other):\n                            self.value += other * 10\n                            return self\n\n                    x = Custom(1)\n                    x += 2\n                    x.value\n                \"\"\"),\n                21,\n            ),\n        ],\n    )\n    def test_evaluate_augassign(self, code, expected_result):\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == expected_result\n\n    @pytest.mark.parametrize(\n        \"operator, expected_result\",\n        [\n            (\"+=\", 7),\n            (\"-=\", 3),\n            (\"*=\", 10),\n            (\"/=\", 2.5),\n            (\"//=\", 2),\n            (\"%=\", 1),\n            (\"**=\", 25),\n            (\"&=\", 0),\n            (\"|=\", 7),\n            (\"^=\", 7),\n            (\">>=\", 1),\n            (\"<<=\", 20),\n        ],\n    )\n    def test_evaluate_augassign_number(self, operator, expected_result):\n        code = dedent(\"\"\"\\\n            x = 5\n            x {operator} 2\n        \"\"\").format(operator=operator)\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == expected_result\n\n    @pytest.mark.parametrize(\n        \"operator, expected_result\",\n        [\n            (\"+=\", 7),\n            (\"-=\", 3),\n            (\"*=\", 10),\n            
(\"/=\", 2.5),\n            (\"//=\", 2),\n            (\"%=\", 1),\n            (\"**=\", 25),\n            (\"&=\", 0),\n            (\"|=\", 7),\n            (\"^=\", 7),\n            (\">>=\", 1),\n            (\"<<=\", 20),\n        ],\n    )\n    def test_evaluate_augassign_custom(self, operator, expected_result):\n        operator_names = {\n            \"+=\": \"iadd\",\n            \"-=\": \"isub\",\n            \"*=\": \"imul\",\n            \"/=\": \"itruediv\",\n            \"//=\": \"ifloordiv\",\n            \"%=\": \"imod\",\n            \"**=\": \"ipow\",\n            \"&=\": \"iand\",\n            \"|=\": \"ior\",\n            \"^=\": \"ixor\",\n            \">>=\": \"irshift\",\n            \"<<=\": \"ilshift\",\n        }\n        code = dedent(\"\"\"\\\n            class Custom:\n                def __init__(self, value):\n                    self.value = value\n                def __{operator_name}__(self, other):\n                    self.value {operator} other\n                    return self\n\n            x = Custom(5)\n            x {operator} 2\n            x.value\n        \"\"\").format(operator=operator, operator_name=operator_names[operator])\n        state = {}\n        result, _ = evaluate_python_code(code, {}, state=state)\n        assert result == expected_result\n\n    @pytest.mark.parametrize(\n        \"code, expected_error_message\",\n        [\n            (\n                dedent(\"\"\"\\\n                    x = 5\n                    del x\n                    x\n                \"\"\"),\n                \"The variable `x` is not defined\",\n            ),\n            (\n                dedent(\"\"\"\\\n                    x = [1, 2, 3]\n                    del x[2]\n                    x[2]\n                \"\"\"),\n                \"IndexError: list index out of range\",\n            ),\n            (\n                dedent(\"\"\"\\\n                    x = {\"key\": \"value\"}\n                    del x[\"key\"]\n   
                 x[\"key\"]\n                \"\"\"),\n                \"Could not index {} with 'key'\",\n            ),\n            (\n                dedent(\"\"\"\\\n                    del x\n                \"\"\"),\n                \"Cannot delete name 'x': name is not defined\",\n            ),\n        ],\n    )\n    def test_evaluate_delete(self, code, expected_error_message):\n        state = {}\n        with pytest.raises(InterpreterError) as exception_info:\n            evaluate_python_code(code, {}, state=state)\n        assert expected_error_message in str(exception_info.value)\n\n    def test_non_standard_comparisons(self):\n        code = dedent(\"\"\"\\\n            class NonStdEqualsResult:\n                def __init__(self, left:object, right:object):\n                    self._left = left\n                    self._right = right\n                def __str__(self) -> str:\n                    return f'{self._left} == {self._right}'\n\n            class NonStdComparisonClass:\n                def __init__(self, value: str ):\n                    self._value = value\n                def __str__(self):\n                    return self._value\n                def __eq__(self, other):\n                    return NonStdEqualsResult(self, other)\n            a = NonStdComparisonClass(\"a\")\n            b = NonStdComparisonClass(\"b\")\n            result = a == b\n            \"\"\")\n        result, _ = evaluate_python_code(code, state={})\n        assert not isinstance(result, bool)\n        assert str(result) == \"a == b\"\n\n\nclass TestEvaluateBoolop:\n    @pytest.mark.parametrize(\"a\", [1, 0])\n    @pytest.mark.parametrize(\"b\", [2, 0])\n    @pytest.mark.parametrize(\"c\", [3, 0])\n    def test_evaluate_boolop_and(self, a, b, c):\n        boolop_ast = ast.parse(\"a and b and c\").body[0].value\n        state = {\"a\": a, \"b\": b, \"c\": c}\n        result = evaluate_boolop(boolop_ast, state, {}, {}, [])\n        assert result == (a and b 
and c)\n\n    @pytest.mark.parametrize(\"a\", [1, 0])\n    @pytest.mark.parametrize(\"b\", [2, 0])\n    @pytest.mark.parametrize(\"c\", [3, 0])\n    def test_evaluate_boolop_or(self, a, b, c):\n        boolop_ast = ast.parse(\"a or b or c\").body[0].value\n        state = {\"a\": a, \"b\": b, \"c\": c}\n        result = evaluate_boolop(boolop_ast, state, {}, {}, [])\n        assert result == (a or b or c)\n\n\nclass TestEvaluateDelete:\n    @pytest.mark.parametrize(\n        \"code, state, expectation\",\n        [\n            (\"del x\", {\"x\": 1}, {}),\n            (\"del x[1]\", {\"x\": [1, 2, 3]}, {\"x\": [1, 3]}),\n            (\"del x['key']\", {\"x\": {\"key\": \"value\"}}, {\"x\": {}}),\n            (\"del x\", {}, InterpreterError(\"Cannot delete name 'x': name is not defined\")),\n        ],\n    )\n    def test_evaluate_delete(self, code, state, expectation):\n        delete_node = ast.parse(code).body[0]\n        if isinstance(expectation, Exception):\n            with pytest.raises(type(expectation)) as exception_info:\n                evaluate_delete(delete_node, state, {}, {}, [])\n            assert str(expectation) in str(exception_info.value)\n        else:\n            evaluate_delete(delete_node, state, {}, {}, [])\n            _ = state.pop(\"_operations_count\", None)\n            assert state == expectation\n\n\nclass TestEvaluateCondition:\n    @pytest.mark.parametrize(\n        \"condition, state, expected_result\",\n        [\n            (\"a == b\", {\"a\": 1, \"b\": 1}, True),\n            (\"a == b\", {\"a\": 1, \"b\": 2}, False),\n            (\"a != b\", {\"a\": 1, \"b\": 1}, False),\n            (\"a != b\", {\"a\": 1, \"b\": 2}, True),\n            (\"a < b\", {\"a\": 1, \"b\": 1}, False),\n            (\"a < b\", {\"a\": 1, \"b\": 2}, True),\n            (\"a < b\", {\"a\": 2, \"b\": 1}, False),\n            (\"a <= b\", {\"a\": 1, \"b\": 1}, True),\n            (\"a <= b\", {\"a\": 1, \"b\": 2}, True),\n            (\"a <= b\", 
{\"a\": 2, \"b\": 1}, False),\n            (\"a > b\", {\"a\": 1, \"b\": 1}, False),\n            (\"a > b\", {\"a\": 1, \"b\": 2}, False),\n            (\"a > b\", {\"a\": 2, \"b\": 1}, True),\n            (\"a >= b\", {\"a\": 1, \"b\": 1}, True),\n            (\"a >= b\", {\"a\": 1, \"b\": 2}, False),\n            (\"a >= b\", {\"a\": 2, \"b\": 1}, True),\n            (\"a is b\", {\"a\": 1, \"b\": 1}, True),\n            (\"a is b\", {\"a\": 1, \"b\": 2}, False),\n            (\"a is not b\", {\"a\": 1, \"b\": 1}, False),\n            (\"a is not b\", {\"a\": 1, \"b\": 2}, True),\n            (\"a in b\", {\"a\": 1, \"b\": [1, 2, 3]}, True),\n            (\"a in b\", {\"a\": 4, \"b\": [1, 2, 3]}, False),\n            (\"a not in b\", {\"a\": 1, \"b\": [1, 2, 3]}, False),\n            (\"a not in b\", {\"a\": 4, \"b\": [1, 2, 3]}, True),\n            # Chained conditions:\n            (\"a == b == c\", {\"a\": 1, \"b\": 1, \"c\": 1}, True),\n            (\"a == b == c\", {\"a\": 1, \"b\": 2, \"c\": 1}, False),\n            (\"a == b < c\", {\"a\": 2, \"b\": 2, \"c\": 2}, False),\n            (\"a == b < c\", {\"a\": 0, \"b\": 0, \"c\": 1}, True),\n        ],\n    )\n    def test_evaluate_condition(self, condition, state, expected_result):\n        condition_ast = ast.parse(condition, mode=\"eval\").body\n        result = evaluate_condition(condition_ast, state, {}, {}, [])\n        assert result == expected_result\n\n    @pytest.mark.parametrize(\n        \"condition, state, expected_result\",\n        [\n            (\"a == b\", {\"a\": pd.Series([1, 2, 3]), \"b\": pd.Series([2, 2, 2])}, pd.Series([False, True, False])),\n            (\"a != b\", {\"a\": pd.Series([1, 2, 3]), \"b\": pd.Series([2, 2, 2])}, pd.Series([True, False, True])),\n            (\"a < b\", {\"a\": pd.Series([1, 2, 3]), \"b\": pd.Series([2, 2, 2])}, pd.Series([True, False, False])),\n            (\"a <= b\", {\"a\": pd.Series([1, 2, 3]), \"b\": pd.Series([2, 2, 2])}, pd.Series([True, True, 
False])),\n            (\"a > b\", {\"a\": pd.Series([1, 2, 3]), \"b\": pd.Series([2, 2, 2])}, pd.Series([False, False, True])),\n            (\"a >= b\", {\"a\": pd.Series([1, 2, 3]), \"b\": pd.Series([2, 2, 2])}, pd.Series([False, True, True])),\n            (\n                \"a == b\",\n                {\"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}), \"b\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 5]})},\n                pd.DataFrame({\"x\": [True, True], \"y\": [True, False]}),\n            ),\n            (\n                \"a != b\",\n                {\"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}), \"b\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 5]})},\n                pd.DataFrame({\"x\": [False, False], \"y\": [False, True]}),\n            ),\n            (\n                \"a < b\",\n                {\"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}), \"b\": pd.DataFrame({\"x\": [2, 2], \"y\": [2, 2]})},\n                pd.DataFrame({\"x\": [True, False], \"y\": [False, False]}),\n            ),\n            (\n                \"a <= b\",\n                {\"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}), \"b\": pd.DataFrame({\"x\": [2, 2], \"y\": [2, 2]})},\n                pd.DataFrame({\"x\": [True, True], \"y\": [False, False]}),\n            ),\n            (\n                \"a > b\",\n                {\"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}), \"b\": pd.DataFrame({\"x\": [2, 2], \"y\": [2, 2]})},\n                pd.DataFrame({\"x\": [False, False], \"y\": [True, True]}),\n            ),\n            (\n                \"a >= b\",\n                {\"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}), \"b\": pd.DataFrame({\"x\": [2, 2], \"y\": [2, 2]})},\n                pd.DataFrame({\"x\": [False, True], \"y\": [True, True]}),\n            ),\n        ],\n    )\n    def test_evaluate_condition_with_pandas(self, condition, state, expected_result):\n        condition_ast = ast.parse(condition, mode=\"eval\").body\n        
result = evaluate_condition(condition_ast, state, {}, {}, [])\n        if isinstance(result, pd.Series):\n            pd.testing.assert_series_equal(result, expected_result)\n        else:\n            pd.testing.assert_frame_equal(result, expected_result)\n\n    @pytest.mark.parametrize(\n        \"condition, state, expected_exception\",\n        [\n            # Chained conditions:\n            (\n                \"a == b == c\",\n                {\n                    \"a\": pd.Series([1, 2, 3]),\n                    \"b\": pd.Series([2, 2, 2]),\n                    \"c\": pd.Series([3, 3, 3]),\n                },\n                ValueError(\n                    \"The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().\"\n                ),\n            ),\n            (\n                \"a == b == c\",\n                {\n                    \"a\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}),\n                    \"b\": pd.DataFrame({\"x\": [2, 2], \"y\": [2, 2]}),\n                    \"c\": pd.DataFrame({\"x\": [3, 3], \"y\": [3, 3]}),\n                },\n                ValueError(\n                    \"The truth value of a DataFrame is ambiguous. 
Use a.empty, a.bool(), a.item(), a.any() or a.all().\"\n                ),\n            ),\n        ],\n    )\n    def test_evaluate_condition_with_pandas_exceptions(self, condition, state, expected_exception):\n        condition_ast = ast.parse(condition, mode=\"eval\").body\n        with pytest.raises(type(expected_exception)) as exception_info:\n            _ = evaluate_condition(condition_ast, state, {}, {}, [])\n        assert str(expected_exception) in str(exception_info.value)\n\n\nclass TestEvaluateSubscript:\n    @pytest.mark.parametrize(\n        \"subscript, state, expected_result\",\n        [\n            (\"dct[1]\", {\"dct\": {1: 11, 2: 22}}, 11),\n            (\"dct[2]\", {\"dct\": {1: \"a\", 2: \"b\"}}, \"b\"),\n            (\"dct['b']\", {\"dct\": {\"a\": 1, \"b\": 2}}, 2),\n            (\"dct['a']\", {\"dct\": {\"a\": \"aa\", \"b\": \"bb\"}}, \"aa\"),\n            (\"dct[1, 2]\", {\"dct\": {(1, 2): 3}}, 3),  # tuple-index\n            (\"dct['a']['b']\", {\"dct\": {\"a\": {\"b\": 1}}}, 1),  # nested\n            (\"lst[0]\", {\"lst\": [1, 2, 3]}, 1),\n            (\"lst[-1]\", {\"lst\": [1, 2, 3]}, 3),\n            (\"lst[1:3]\", {\"lst\": [1, 2, 3, 4]}, [2, 3]),\n            (\"lst[:]\", {\"lst\": [1, 2, 3]}, [1, 2, 3]),\n            (\"lst[::2]\", {\"lst\": [1, 2, 3, 4]}, [1, 3]),\n            (\"lst[::-1]\", {\"lst\": [1, 2, 3]}, [3, 2, 1]),\n            (\"tup[1]\", {\"tup\": (1, 2, 3)}, 2),\n            (\"tup[-1]\", {\"tup\": (1, 2, 3)}, 3),\n            (\"tup[1:3]\", {\"tup\": (1, 2, 3, 4)}, (2, 3)),\n            (\"tup[:]\", {\"tup\": (1, 2, 3)}, (1, 2, 3)),\n            (\"tup[::2]\", {\"tup\": (1, 2, 3, 4)}, (1, 3)),\n            (\"tup[::-1]\", {\"tup\": (1, 2, 3)}, (3, 2, 1)),\n            (\"st[1]\", {\"st\": \"abc\"}, \"b\"),\n            (\"st[-1]\", {\"st\": \"abc\"}, \"c\"),\n            (\"st[1:3]\", {\"st\": \"abcd\"}, \"bc\"),\n            (\"st[:]\", {\"st\": \"abc\"}, \"abc\"),\n            (\"st[::2]\", {\"st\": 
\"abcd\"}, \"ac\"),\n            (\"st[::-1]\", {\"st\": \"abc\"}, \"cba\"),\n            (\"arr[1]\", {\"arr\": np.array([1, 2, 3])}, 2),\n            (\"arr[1:3]\", {\"arr\": np.array([1, 2, 3, 4])}, np.array([2, 3])),\n            (\"arr[:]\", {\"arr\": np.array([1, 2, 3])}, np.array([1, 2, 3])),\n            (\"arr[::2]\", {\"arr\": np.array([1, 2, 3, 4])}, np.array([1, 3])),\n            (\"arr[::-1]\", {\"arr\": np.array([1, 2, 3])}, np.array([3, 2, 1])),\n            (\"arr[1, 2]\", {\"arr\": np.array([[1, 2, 3], [4, 5, 6]])}, 6),\n            (\"ser[1]\", {\"ser\": pd.Series([1, 2, 3])}, 2),\n            (\"ser.loc[1]\", {\"ser\": pd.Series([1, 2, 3])}, 2),\n            (\"ser.loc[1]\", {\"ser\": pd.Series([1, 2, 3], index=[2, 3, 1])}, 3),\n            (\"ser.iloc[1]\", {\"ser\": pd.Series([1, 2, 3])}, 2),\n            (\"ser.iloc[1]\", {\"ser\": pd.Series([1, 2, 3], index=[2, 3, 1])}, 2),\n            (\"ser.at[1]\", {\"ser\": pd.Series([1, 2, 3])}, 2),\n            (\"ser.at[1]\", {\"ser\": pd.Series([1, 2, 3], index=[2, 3, 1])}, 3),\n            (\"ser.iat[1]\", {\"ser\": pd.Series([1, 2, 3])}, 2),\n            (\"ser.iat[1]\", {\"ser\": pd.Series([1, 2, 3], index=[2, 3, 1])}, 2),\n            (\"ser[1:3]\", {\"ser\": pd.Series([1, 2, 3, 4])}, pd.Series([2, 3], index=[1, 2])),\n            (\"ser[:]\", {\"ser\": pd.Series([1, 2, 3])}, pd.Series([1, 2, 3])),\n            (\"ser[::2]\", {\"ser\": pd.Series([1, 2, 3, 4])}, pd.Series([1, 3], index=[0, 2])),\n            (\"ser[::-1]\", {\"ser\": pd.Series([1, 2, 3])}, pd.Series([3, 2, 1], index=[2, 1, 0])),\n            (\"df['y'][1]\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]})}, 4),\n            (\"df['y'][5]\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}, index=[5, 6])}, 3),\n            (\"df.loc[1, 'y']\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]})}, 4),\n            (\"df.loc[5, 'y']\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}, index=[5, 6])}, 3),\n           
 (\"df.iloc[1, 1]\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]})}, 4),\n            (\"df.iloc[1, 1]\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}, index=[5, 6])}, 4),\n            (\"df.at[1, 'y']\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]})}, 4),\n            (\"df.at[5, 'y']\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}, index=[5, 6])}, 3),\n            (\"df.iat[1, 1]\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]})}, 4),\n            (\"df.iat[1, 1]\", {\"df\": pd.DataFrame({\"x\": [1, 2], \"y\": [3, 4]}, index=[5, 6])}, 4),\n        ],\n    )\n    def test_evaluate_subscript(self, subscript, state, expected_result):\n        subscript_ast = ast.parse(subscript).body[0].value\n        result = evaluate_subscript(subscript_ast, state, {}, {}, [])\n        try:\n            assert result == expected_result\n        except ValueError:\n            assert (result == expected_result).all()\n\n    @pytest.mark.parametrize(\n        \"subscript, state, expected_error_message\",\n        [\n            (\"dct['a']\", {\"dct\": {}}, \"KeyError: 'a'\"),\n            (\"dct[0]\", {\"dct\": {}}, \"KeyError: 0\"),\n            (\"dct['c']\", {\"dct\": {\"a\": 1, \"b\": 2}}, \"KeyError: 'c'\"),\n            (\"dct[1, 2, 3]\", {\"dct\": {(1, 2): 3}}, \"KeyError: (1, 2, 3)\"),\n            (\"lst[0]\", {\"lst\": []}, \"IndexError: list index out of range\"),\n            (\"lst[3]\", {\"lst\": [1, 2, 3]}, \"IndexError: list index out of range\"),\n            (\"lst[-4]\", {\"lst\": [1, 2, 3]}, \"IndexError: list index out of range\"),\n            (\"value[0]\", {\"value\": 1}, \"TypeError: 'int' object is not subscriptable\"),\n        ],\n    )\n    def test_evaluate_subscript_error(self, subscript, state, expected_error_message):\n        subscript_ast = ast.parse(subscript).body[0].value\n        with pytest.raises(InterpreterError, match=\"Could not index\") as exception_info:\n            _ = 
evaluate_subscript(subscript_ast, state, {}, {}, [])\n        assert expected_error_message in str(exception_info.value)\n\n    @pytest.mark.parametrize(\n        \"subscriptable_class, expectation\",\n        [\n            (True, 20),\n            (False, InterpreterError(\"TypeError: 'Custom' object is not subscriptable\")),\n        ],\n    )\n    def test_evaluate_subscript_with_custom_class(self, subscriptable_class, expectation):\n        if subscriptable_class:\n\n            class Custom:\n                def __getitem__(self, key):\n                    return key * 10\n        else:\n\n            class Custom:\n                pass\n\n        state = {\"obj\": Custom()}\n        subscript = \"obj[2]\"\n        subscript_ast = ast.parse(subscript).body[0].value\n        if isinstance(expectation, Exception):\n            with pytest.raises(type(expectation), match=\"Could not index\") as exception_info:\n                evaluate_subscript(subscript_ast, state, {}, {}, [])\n            assert \"TypeError: 'Custom' object is not subscriptable\" in str(exception_info.value)\n        else:\n            result = evaluate_subscript(subscript_ast, state, {}, {}, [])\n            assert result == expectation\n\n\ndef test_get_safe_module_handle_lazy_imports():\n    class FakeModule(types.ModuleType):\n        def __init__(self, name):\n            super().__init__(name)\n            self.non_lazy_attribute = \"ok\"\n\n        def __getattr__(self, name):\n            if name == \"lazy_attribute\":\n                raise ImportError(\"lazy import failure\")\n            return super().__getattr__(name)\n\n        def __dir__(self):\n            return super().__dir__() + [\"lazy_attribute\"]\n\n    fake_module = FakeModule(\"fake_module\")\n    safe_module = get_safe_module(fake_module, authorized_imports=set())\n    assert not hasattr(safe_module, \"lazy_attribute\")\n    assert getattr(safe_module, \"non_lazy_attribute\") == \"ok\"\n\n\nclass 
TestPrintContainer:\n    def test_initial_value(self):\n        pc = PrintContainer()\n        assert pc.value == \"\"\n\n    def test_append(self):\n        pc = PrintContainer()\n        pc.append(\"Hello\")\n        assert pc.value == \"Hello\"\n\n    def test_iadd(self):\n        pc = PrintContainer()\n        pc += \"World\"\n        assert pc.value == \"World\"\n\n    def test_str(self):\n        pc = PrintContainer()\n        pc.append(\"Hello\")\n        assert str(pc) == \"Hello\"\n\n    def test_repr(self):\n        pc = PrintContainer()\n        pc.append(\"Hello\")\n        assert repr(pc) == \"PrintContainer(Hello)\"\n\n    def test_len(self):\n        pc = PrintContainer()\n        pc.append(\"Hello\")\n        assert len(pc) == 5\n\n\ndef test_fix_final_answer_code():\n    test_cases = [\n        (\n            \"final_answer = 3.21\\nfinal_answer(final_answer)\",\n            \"final_answer_variable = 3.21\\nfinal_answer(final_answer_variable)\",\n        ),\n        (\n            \"x = final_answer(5)\\nfinal_answer = x + 1\\nfinal_answer(final_answer)\",\n            \"x = final_answer(5)\\nfinal_answer_variable = x + 1\\nfinal_answer(final_answer_variable)\",\n        ),\n        (\n            \"def func():\\n    final_answer = 42\\n    return final_answer(final_answer)\",\n            \"def func():\\n    final_answer_variable = 42\\n    return final_answer(final_answer_variable)\",\n        ),\n        (\n            \"final_answer(5)  # Should not change function calls\",\n            \"final_answer(5)  # Should not change function calls\",\n        ),\n        (\n            \"obj.final_answer = 5  # Should not change object attributes\",\n            \"obj.final_answer = 5  # Should not change object attributes\",\n        ),\n        (\n            \"final_answer=3.21;final_answer(final_answer)\",\n            \"final_answer_variable=3.21;final_answer(final_answer_variable)\",\n        ),\n    ]\n\n    for i, (input_code, expected) in 
enumerate(test_cases, 1):\n        result = fix_final_answer_code(input_code)\n        assert result == expected, f\"\"\"\nTest case {i} failed:\nInput:    {input_code}\nExpected: {expected}\nGot:      {result}\n\"\"\"\n\n\nclass TestTimeout:\n    \"\"\"Test the timeout mechanism for code execution.\"\"\"\n\n    def test_timeout_decorator_completes_within_limit(self):\n        \"\"\"Test that code completing within the timeout limit works correctly.\"\"\"\n\n        @timeout(2)\n        def short_task():\n            time.sleep(0.1)\n            return \"completed\"\n\n        assert short_task() == \"completed\"\n\n    def test_timeout_decorator_raises_error_when_exceeded(self):\n        \"\"\"Test that code exceeding the timeout limit raises ExecutionTimeoutError.\"\"\"\n\n        @timeout(1)\n        def long_task():\n            time.sleep(2)\n            return \"should not complete\"\n\n        with pytest.raises(ExecutionTimeoutError, match=\"Code execution exceeded the maximum execution time\"):\n            long_task()\n\n    def test_evaluate_python_code_with_timeout_completes(self):\n        \"\"\"Test that evaluate_python_code completes within timeout for quick code.\"\"\"\n        code = \"result = 2 + 2\"\n        result, is_final = evaluate_python_code(code)\n        assert result == 4\n\n    def test_evaluate_python_code_with_timeout_raises(self):\n        \"\"\"Test that evaluate_python_code raises timeout error for long-running code.\"\"\"\n        # Use a short custom timeout (2 seconds) with longer sleep (3 seconds) to test quickly\n        code = \"\"\"\nimport time\ntime.sleep(3)\nresult = \"should not complete\"\n\"\"\"\n        with pytest.raises(ExecutionTimeoutError, match=\"Code execution exceeded the maximum execution time\"):\n            evaluate_python_code(code, authorized_imports=[\"time\"], timeout_seconds=2)\n\n    def test_timeout_works_in_thread(self):\n        \"\"\"Test that timeout mechanism works when called from a non-main 
thread.\n\n        This verifies the fix for the issue where signal-based timeouts failed\n        in threads (signals only work in the main thread).\n        \"\"\"\n        import threading\n\n        result = {\"success\": False, \"error\": None}\n\n        def run_in_thread():\n            try:\n                # Quick code should work\n                code = \"result = 42\"\n                res, _ = evaluate_python_code(code)\n                assert res == 42\n\n                # Timeout should still work in thread - use short timeout for fast test\n                timeout_code = \"\"\"\nimport time\ntime.sleep(3)\n\"\"\"\n                try:\n                    evaluate_python_code(timeout_code, authorized_imports=[\"time\"], timeout_seconds=2)\n                    result[\"error\"] = \"Code should have timed out but didn't\"\n                except ExecutionTimeoutError:\n                    result[\"success\"] = True\n            except Exception as e:\n                result[\"error\"] = f\"{type(e).__name__}: {e}\"\n\n        thread = threading.Thread(target=run_in_thread)\n        thread.start()\n        thread.join(timeout=10)\n\n        assert not thread.is_alive(), \"Thread should have completed\"\n        assert result[\"error\"] is None, f\"Error in thread: {result['error']}\"\n        assert result[\"success\"], \"Timeout should have been raised in thread\"\n\n    def test_custom_timeout_value(self):\n        \"\"\"Test that a custom timeout value can be specified.\"\"\"\n        # Code that sleeps for 2 seconds should timeout with 1-second limit\n        code = \"\"\"\nimport time\ntime.sleep(2)\n\"\"\"\n        with pytest.raises(ExecutionTimeoutError, match=\"Code execution exceeded the maximum execution time of 1\"):\n            evaluate_python_code(code, authorized_imports=[\"time\"], timeout_seconds=1)\n\n    def test_longer_timeout_value(self):\n        \"\"\"Test that a longer custom timeout value allows longer execution.\"\"\"\n        
# Code that sleeps for 2 seconds should complete with 5-second limit\n        code = \"\"\"\nimport time\ntime.sleep(2)\nresult = \"completed\"\n\"\"\"\n        result, is_final = evaluate_python_code(code, authorized_imports=[\"time\"], timeout_seconds=5)\n        assert result == \"completed\"\n\n    def test_disabled_timeout(self):\n        \"\"\"Test that timeout can be disabled by setting it to None.\"\"\"\n        # Even slow code should complete when timeout is disabled\n        # Using a shorter sleep to keep test fast, but demonstrating None works\n        code = \"\"\"\nimport time\ntime.sleep(0.5)\nresult = \"completed without timeout\"\n\"\"\"\n        result, is_final = evaluate_python_code(code, authorized_imports=[\"time\"], timeout_seconds=None)\n        assert result == \"completed without timeout\"\n\n    def test_local_executor_custom_timeout(self):\n        \"\"\"Test that LocalPythonExecutor respects custom timeout.\"\"\"\n        executor = LocalPythonExecutor(additional_authorized_imports=[\"time\"], timeout_seconds=1)\n        executor.send_tools({})\n\n        # Code that sleeps for 2 seconds should timeout with 1-second executor limit\n        code = \"\"\"\nimport time\ntime.sleep(2)\n\"\"\"\n        with pytest.raises(ExecutionTimeoutError, match=\"Code execution exceeded the maximum execution time of 1\"):\n            executor(code)\n\n    def test_local_executor_disabled_timeout(self):\n        \"\"\"Test that LocalPythonExecutor can disable timeout.\"\"\"\n        executor = LocalPythonExecutor(additional_authorized_imports=[\"time\"], timeout_seconds=None)\n        executor.send_tools({})\n\n        # Code should complete even without timeout\n        code = \"\"\"\nimport time\ntime.sleep(0.5)\nresult = \"completed\"\n\"\"\"\n        output = executor(code)\n        assert output.output == \"completed\"\n\n\n@pytest.mark.parametrize(\n    \"module,authorized_imports,expected\",\n    [\n        (\"os\", [\"other\", \"*\"], True),\n  
      (\"AnyModule\", [\"*\"], True),\n        (\"os\", [\"os\"], True),\n        (\"AnyModule\", [\"AnyModule\"], True),\n        (\"Module.os\", [\"Module\"], False),\n        (\"Module.os\", [\"Module\", \"Module.os\"], True),\n        (\"os.path\", [\"os.*\"], True),\n        (\"os\", [\"os.path\"], True),\n    ],\n)\ndef test_check_import_authorized(module: str, authorized_imports: list[str], expected: bool):\n    assert check_import_authorized(module, authorized_imports) == expected\n\n\nclass TestLocalPythonExecutor:\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, should_raise\",\n        [\n            # Valid imports\n            ([\"math\"], None),\n            ([\"math\", \"os\"], None),  # Multiple valid imports\n            ([], None),  # Empty list of imports\n            ([\"*\"], None),  # Wildcard allows all imports\n            ([\"os.*\"], None),  # Submodule wildcard\n            # Invalid imports\n            ([\"i_do_not_exist\"], True),  # Non-existent module\n            ([\"math\", \"i_do_not_exist\"], True),  # Mix of valid and invalid\n            ([\"i_do_not_exist.*\"], True),  # Non-existent module with wildcard\n        ],\n    )\n    def test_additional_authorized_imports_are_installed(self, additional_authorized_imports, should_raise):\n        expectation = (\n            pytest.raises(InterpreterError, match=\"Non-installed authorized modules\")\n            if should_raise\n            else does_not_raise()\n        )\n        with expectation:\n            LocalPythonExecutor(additional_authorized_imports=additional_authorized_imports)\n\n    def test_state_name(self):\n        executor = LocalPythonExecutor(additional_authorized_imports=[])\n        assert executor.state.get(\"__name__\") == \"__main__\"\n\n    @pytest.mark.parametrize(\n        \"code\",\n        [\n            \"d = {'func': lambda x: x + 10}; func = d['func']; func(1)\",\n            \"d = {'func': lambda x: x + 10}; 
d['func'](1)\",\n        ],\n    )\n    def test_call_from_dict(self, code):\n        executor = LocalPythonExecutor([])\n        result = executor(code).output\n        assert result == 11\n\n    @pytest.mark.parametrize(\n        \"code\",\n        [\n            \"a = b = 1; a\",\n            \"a = b = 1; b\",\n            \"a, b = c, d = 1, 1; a\",\n            \"a, b = c, d = 1, 1; b\",\n            \"a, b = c, d = 1, 1; c\",\n            \"a, b = c, d = {1, 2}; a\",\n            \"a, b = c, d = {1, 2}; c\",\n            \"a, b = c, d = {1: 10, 2: 20}; a\",\n            \"a, b = c, d = {1: 10, 2: 20}; c\",\n            \"a = b = (lambda: 1)(); b\",\n            \"a = b = (lambda: 1)(); lambda x: 10; b\",\n            \"a = b = (lambda x: lambda y: x + y)(0)(1); b\",\n            dedent(\"\"\"\n            def foo():\n                return 1;\n            a = b = foo(); b\"\"\"),\n            dedent(\"\"\"\n            def foo(*args, **kwargs):\n                return sum(args)\n            a = b = foo(1,-1,1); b\"\"\"),\n            \"a, b = 1, 2; a, b = b, a; b\",\n        ],\n    )\n    def test_chained_assignments(self, code):\n        executor = LocalPythonExecutor([])\n        executor.send_tools({})\n        result = executor(code).output\n        assert result == 1\n\n    def test_evaluate_assign_error(self):\n        code = \"a, b = 1, 2, 3; a\"\n        executor = LocalPythonExecutor([])\n        with pytest.raises(InterpreterError, match=\".*Cannot unpack tuple of wrong size\"):\n            executor(code)\n\n    def test_function_def_recovers_source_code(self):\n        executor = LocalPythonExecutor([])\n        executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        res = executor(\n            dedent(\n                \"\"\"\n                def target_function():\n                    return \"Hello world\"\n\n                final_answer(target_function)\n                \"\"\"\n            )\n        ).output\n        assert 
res.__name__ == \"target_function\"\n        assert res.__source__ == \"def target_function():\\n    return 'Hello world'\"\n\n    @pytest.mark.parametrize(\n        \"code, expected_result\",\n        [(\"isinstance(5, int)\", True), (\"isinstance('foo', str)\", True), (\"isinstance(5, str)\", False)],\n    )\n    def test_isinstance_builtin_type(self, code, expected_result):\n        executor = LocalPythonExecutor([])\n        executor.send_tools({})\n        result = executor(code).output\n        assert result is expected_result\n\n\nclass TestLocalPythonExecutorSecurity:\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, expected_error\",\n        [([], InterpreterError(\"Import of os is not allowed\")), ([\"os\"], None)],\n    )\n    def test_vulnerability_import(self, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\"import os\")\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, expected_error\",\n        [([], InterpreterError(\"Import of builtins is not allowed\")), ([\"builtins\"], None)],\n    )\n    def test_vulnerability_builtins(self, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\"import builtins\")\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, expected_error\",\n        [([], InterpreterError(\"Import of builtins is not allowed\")), ([\"builtins\"], None)],\n    )\n    def 
test_vulnerability_builtins_safe_functions(self, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\"import builtins; builtins.print(1)\")\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, additional_tools, expected_error\",\n        [\n            ([], [], InterpreterError(\"Import of builtins is not allowed\")),\n            ([\"builtins\"], [], InterpreterError(\"Forbidden access to function: exec\")),\n            ([\"builtins\"], [\"exec\"], None),\n        ],\n    )\n    def test_vulnerability_builtins_dangerous_functions(\n        self, additional_authorized_imports, additional_tools, expected_error\n    ):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        if additional_tools:\n            from builtins import exec\n\n            executor.send_tools({\"exec\": exec})\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\"import builtins; builtins.exec\")\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, additional_tools, expected_error\",\n        [\n            ([], [], InterpreterError(\"Import of os is not allowed\")),\n            ([\"os\"], [], InterpreterError(\"Forbidden access to function: popen\")),\n            ([\"os\"], [\"popen\"], None),\n        ],\n    )\n    def test_vulnerability_dangerous_functions(self, additional_authorized_imports, additional_tools, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        if additional_tools:\n            from os import popen\n\n            
executor.send_tools({\"popen\": popen})\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\"import os; os.popen\")\n\n    @pytest.mark.parametrize(\"dangerous_function\", DANGEROUS_FUNCTIONS)\n    def test_vulnerability_for_all_dangerous_functions(self, dangerous_function):\n        dangerous_module_name, dangerous_function_name = dangerous_function.rsplit(\".\", 1)\n        # Skip test if module is not installed: posix module is not installed on Windows\n        pytest.importorskip(dangerous_module_name)\n        executor = LocalPythonExecutor([dangerous_module_name])\n        if \"__\" in dangerous_function_name:\n            error_match = f\".*Forbidden access to dunder attribute: {dangerous_function_name}\"\n        else:\n            error_match = f\".*Forbidden access to function: {dangerous_function_name}.*\"\n        with pytest.raises(InterpreterError, match=error_match):\n            executor(f\"import {dangerous_module_name}; {dangerous_function}\")\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, expected_error\",\n        [\n            ([], InterpreterError(\"Import of sys is not allowed\")),\n            ([\"sys\"], InterpreterError(\"Forbidden access to module: os\")),\n            ([\"sys\", \"os\"], None),\n        ],\n    )\n    def test_vulnerability_via_sys(self, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\n                dedent(\n                    \"\"\"\n                    import sys\n                    sys.modules[\"os\"].system(\":\")\n                    
\"\"\"\n                )\n            )\n\n    @pytest.mark.parametrize(\"dangerous_module\", DANGEROUS_MODULES)\n    def test_vulnerability_via_sys_for_all_dangerous_modules(self, dangerous_module):\n        import sys\n\n        if dangerous_module not in sys.modules or dangerous_module == \"sys\":\n            pytest.skip(\"module not present in sys.modules\")\n        executor = LocalPythonExecutor([\"sys\"])\n        with pytest.raises(InterpreterError) as exception_info:\n            executor(\n                dedent(\n                    f\"\"\"\n                    import sys\n                    sys.modules[\"{dangerous_module}\"]\n                    \"\"\"\n                )\n            )\n        assert f\"Forbidden access to module: {dangerous_module}\" in str(exception_info.value)\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, expected_error\",\n        [([\"importlib\"], InterpreterError(\"Forbidden access to module: os\")), ([\"importlib\", \"os\"], None)],\n    )\n    def test_vulnerability_via_importlib(self, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\n                dedent(\n                    \"\"\"\n                    import importlib\n                    importlib.import_module(\"os\").system(\":\")\n                    \"\"\"\n                )\n            )\n\n    @pytest.mark.parametrize(\n        \"code, additional_authorized_imports, expected_error\",\n        [\n            # os submodule\n            (\n                \"import queue; queue.threading._os.system(':')\",\n                [],\n                InterpreterError(\"Forbidden access to module: threading\"),\n            ),\n            (\n                
\"import queue; queue.threading._os.system(':')\",\n                [\"threading\"],\n                InterpreterError(\"Forbidden access to module: os\"),\n            ),\n            (\"import random; random._os.system(':')\", [], InterpreterError(\"Forbidden access to module: os\")),\n            (\n                \"import random; random.__dict__['_os'].system(':')\",\n                [],\n                InterpreterError(\"Forbidden access to dunder attribute: __dict__\"),\n            ),\n            (\n                \"import doctest; doctest.inspect.os.system(':')\",\n                [\"doctest\"],\n                InterpreterError(\"Forbidden access to module: inspect\"),\n            ),\n            (\n                \"import doctest; doctest.inspect.os.system(':')\",\n                [\"doctest\", \"inspect\"],\n                InterpreterError(\"Forbidden access to module: os\"),\n            ),\n            # subprocess submodule\n            (\n                \"import asyncio; asyncio.base_events.events.subprocess\",\n                [\"asyncio\"],\n                InterpreterError(\"Forbidden access to module: asyncio.base_events\"),\n            ),\n            (\n                \"import asyncio; asyncio.base_events.events.subprocess\",\n                [\"asyncio\", \"asyncio.base_events\"],\n                InterpreterError(\"Forbidden access to module: asyncio.events\"),\n            ),\n            (\n                \"import asyncio; asyncio.base_events.events.subprocess\",\n                [\"asyncio\", \"asyncio.base_events\", \"asyncio.base_events.events\"],\n                InterpreterError(\"Forbidden access to module: asyncio.events\"),\n            ),\n            # sys submodule\n            (\n                \"import queue; queue.threading._sys.modules['os'].system(':')\",\n                [],\n                InterpreterError(\"Forbidden access to module: threading\"),\n            ),\n            (\n                \"import 
queue; queue.threading._sys.modules['os'].system(':')\",\n                [\"threading\"],\n                InterpreterError(\"Forbidden access to module: sys\"),\n            ),\n            (\"import warnings; warnings.sys\", [\"warnings\"], InterpreterError(\"Forbidden access to module: sys\")),\n            # Allowed\n            (\"import pandas; pandas.io\", [\"pandas\", \"pandas.io\"], None),\n        ],\n    )\n    def test_vulnerability_via_submodules(self, code, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(code)\n\n    @pytest.mark.parametrize(\n        \"code, additional_authorized_imports, expected_error\",\n        [\n            # Using filter with functools.partial\n            (\n                dedent(\n                    \"\"\"\n                    import functools\n                    import warnings\n                    list(filter(functools.partial(getattr, warnings), [\"sys\"]))\n                    \"\"\"\n                ),\n                [\"warnings\", \"functools\"],\n                InterpreterError(\"Forbidden access to module: sys\"),\n            ),\n            # Using map\n            (\n                dedent(\n                    \"\"\"\n                    import warnings\n                    list(map(getattr, [warnings], [\"sys\"]))\n                    \"\"\"\n                ),\n                [\"warnings\"],\n                InterpreterError(\"Forbidden access to module: sys\"),\n            ),\n            # Using map with functools.partial\n            (\n                dedent(\n                    \"\"\"\n                    import functools\n                    import warnings\n                    
list(map(functools.partial(getattr, warnings), [\"sys\"]))\n                    \"\"\"\n                ),\n                [\"warnings\", \"functools\"],\n                InterpreterError(\"Forbidden access to module: sys\"),\n            ),\n        ],\n    )\n    def test_vulnerability_via_submodules_through_indirect_attribute_access(\n        self, code, additional_authorized_imports, expected_error\n    ):\n        # warnings.sys\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        executor.send_tools({})\n        with pytest.raises(type(expected_error), match=f\".*{expected_error}\"):\n            executor(code)\n\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, additional_tools, expected_error\",\n        [\n            ([], [], InterpreterError(\"Import of sys is not allowed\")),\n            ([\"sys\"], [], InterpreterError(\"Forbidden access to module: builtins\")),\n            (\n                [\"sys\", \"builtins\"],\n                [],\n                InterpreterError(\"Forbidden access to function: __import__\"),\n            ),\n            ([\"sys\", \"builtins\"], [\"__import__\"], InterpreterError(\"Forbidden access to module: os\")),\n            ([\"sys\", \"builtins\", \"os\"], [\"__import__\"], None),\n        ],\n    )\n    def test_vulnerability_builtins_via_sys(self, additional_authorized_imports, additional_tools, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        if additional_tools:\n            from builtins import __import__\n\n            executor.send_tools({\"__import__\": __import__})\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\n                dedent(\n                    \"\"\"\n                    import sys\n                    builtins = 
sys._getframe().f_builtins\n                    builtins_import = builtins[\"__import__\"]\n                    os_module = builtins_import(\"os\")\n                    os_module.system(\":\")\n                    \"\"\"\n                )\n            )\n\n    @pytest.mark.parametrize(\"patch_builtin_import_module\", [False, True])  # builtins_import.__module__ = None\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, additional_tools, expected_error\",\n        [\n            ([], [], InterpreterError(\"Forbidden access to dunder attribute: __traceback__\")),\n            (\n                [\"builtins\", \"os\"],\n                [\"__import__\"],\n                InterpreterError(\"Forbidden access to dunder attribute: __traceback__\"),\n            ),\n        ],\n    )\n    def test_vulnerability_builtins_via_traceback(\n        self, patch_builtin_import_module, additional_authorized_imports, additional_tools, expected_error, monkeypatch\n    ):\n        if patch_builtin_import_module:\n            monkeypatch.setattr(\"builtins.__import__.__module__\", None)  # inspect.getmodule(func) = None\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        if additional_tools:\n            from builtins import __import__\n\n            executor.send_tools({\"__import__\": __import__})\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\n                dedent(\n                    \"\"\"\n                    try:\n                        1 / 0\n                    except Exception as e:\n                        builtins = e.__traceback__.tb_frame.f_back.f_globals[\"__builtins__\"]\n                        builtins_import = builtins[\"__import__\"]\n                        os_module = builtins_import(\"os\")\n                        os_module.system(\":\")\n 
                   \"\"\"\n                )\n            )\n\n    @pytest.mark.parametrize(\"patch_builtin_import_module\", [False, True])  # builtins_import.__module__ = None\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, additional_tools, expected_error\",\n        [\n            ([], [], InterpreterError(\"Forbidden access to dunder attribute: __base__\")),\n            ([\"warnings\"], [], InterpreterError(\"Forbidden access to dunder attribute: __base__\")),\n            (\n                [\"warnings\", \"builtins\"],\n                [],\n                InterpreterError(\"Forbidden access to dunder attribute: __base__\"),\n            ),\n            ([\"warnings\", \"builtins\", \"os\"], [], InterpreterError(\"Forbidden access to dunder attribute: __base__\")),\n            (\n                [\"warnings\", \"builtins\", \"os\"],\n                [\"__import__\"],\n                InterpreterError(\"Forbidden access to dunder attribute: __base__\"),\n            ),\n        ],\n    )\n    def test_vulnerability_builtins_via_class_catch_warnings(\n        self, patch_builtin_import_module, additional_authorized_imports, additional_tools, expected_error, monkeypatch\n    ):\n        if patch_builtin_import_module:\n            monkeypatch.setattr(\"builtins.__import__.__module__\", None)  # inspect.getmodule(func) = None\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        if additional_tools:\n            from builtins import __import__\n\n            executor.send_tools({\"__import__\": __import__})\n        if isinstance(expected_error, tuple):  # different error depending on patch status\n            expected_error = expected_error[patch_builtin_import_module]\n        if isinstance(expected_error, Exception):\n            expectation = pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n        elif expected_error is None:\n            expectation = does_not_raise()\n        with 
expectation:\n            executor(\n                dedent(\n                    \"\"\"\n                    classes = {}.__class__.__base__.__subclasses__()\n                    for cls in classes:\n                        if cls.__name__ == \"catch_warnings\":\n                            break\n                    builtins = cls()._module.__builtins__\n                    builtins_import = builtins[\"__import__\"]\n                    os_module = builtins_import('os')\n                    os_module.system(\":\")\n                    \"\"\"\n                )\n            )\n\n    @pytest.mark.filterwarnings(\"ignore::DeprecationWarning\")\n    @pytest.mark.parametrize(\n        \"additional_authorized_imports, expected_error\",\n        [\n            ([], InterpreterError(\"Forbidden access to dunder attribute: __base__\")),\n            ([\"os\"], InterpreterError(\"Forbidden access to dunder attribute: __base__\")),\n        ],\n    )\n    def test_vulnerability_load_module_via_builtin_importer(self, additional_authorized_imports, expected_error):\n        executor = LocalPythonExecutor(additional_authorized_imports)\n        with (\n            pytest.raises(type(expected_error), match=f\".*{expected_error}\")\n            if isinstance(expected_error, Exception)\n            else does_not_raise()\n        ):\n            executor(\n                dedent(\n                    \"\"\"\n                    classes = {}.__class__.__base__.__subclasses__()\n                    for cls in classes:\n                        if cls.__name__ == \"BuiltinImporter\":\n                            break\n                    os_module = cls().load_module(\"os\")\n                    os_module.system(\":\")\n                    \"\"\"\n                )\n            )\n\n    def test_vulnerability_class_via_subclasses(self):\n        # Subclass: subprocess.Popen\n        executor = LocalPythonExecutor([])\n        code = dedent(\n            \"\"\"\n            for cls in 
().__class__.__base__.__subclasses__():\n                if 'Popen' in cls.__class__.__repr__(cls):\n                    break\n            cls([\"sh\", \"-c\", \":\"]).wait()\n            \"\"\"\n        )\n        with pytest.raises(InterpreterError, match=\"Forbidden access to dunder attribute: __base__\"):\n            executor(code)\n\n        code = dedent(\n            \"\"\"\n            [c for c in ().__class__.__base__.__subclasses__() if \"Popen\" in c.__class__.__repr__(c)][0](\n                [\"sh\", \"-c\", \":\"]\n            ).wait()\n            \"\"\"\n        )\n        with pytest.raises(InterpreterError, match=\"Forbidden access to dunder attribute: __base__\"):\n            executor(code)\n\n    @pytest.mark.parametrize(\n        \"code, dunder_attribute\",\n        [(\"a = (); b = a.__class__\", \"__class__\"), (\"class A:\\n    attr=1\\nx = A()\\nx_dict = x.__dict__\", \"__dict__\")],\n    )\n    def test_vulnerability_via_dunder_access(self, code, dunder_attribute):\n        executor = LocalPythonExecutor([])\n        with pytest.raises(InterpreterError, match=f\"Forbidden access to dunder attribute: {dunder_attribute}\"):\n            executor(code)\n\n    def test_vulnerability_via_dunder_indirect_access(self):\n        executor = LocalPythonExecutor([])\n        code = \"a = (); b = getattr(a, '__class__')\"\n        with pytest.raises(InterpreterError, match=\"Forbidden function evaluation: 'getattr'\"):\n            executor(code)\n\n    @pytest.mark.parametrize(\n        \"code, additional_allowed_dunder_methods, expected_dunder_function_error\",\n        [\n            (\"'string'.__dir__()\", [], \"__dir__\"),\n            (\"'string'.__dir__()\", [\"__dir__\"], None),\n            (\n                dedent(\n                    \"\"\"\n                    def func():\n                        pass\n                    function_class = func.__getattribute__(\"__class__\")\n                    object_class = 
type.__getattribute__(function_class, '__bases__')[0]\n                    for i, subclass in enumerate(object_class.__subclasses__()):\n                        if 'subprocess.Popen' in str(subclass):\n                            break\n                    subclass([\"sh\", \"-c\", \":\"]).wait()\n                    \"\"\"\n                ),\n                [],\n                \"__getattribute__\",\n            ),\n            (\n                dedent(\n                    \"\"\"\n                    def func():\n                        pass\n                    function_class = func.__getattribute__(\"__class__\")\n                    object_class = type.__getattribute__(function_class, '__bases__')[0]\n                    for i, subclass in enumerate(object_class.__subclasses__()):\n                        if 'subprocess.Popen' in str(subclass):\n                            break\n                    subclass([\"sh\", \"-c\", \":\"]).wait()\n                    \"\"\"\n                ),\n                [\"__getattribute__\"],\n                \"__subclasses__\",\n            ),\n            (\n                dedent(\n                    \"\"\"\n                    def func():\n                        pass\n                    function_class = func.__getattribute__(\"__class__\")\n                    object_class = type.__getattribute__(function_class, '__bases__')[0]\n                    for i, subclass in enumerate(object_class.__subclasses__()):\n                        if 'subprocess.Popen' in str(subclass):\n                            break\n                    subclass([\"sh\", \"-c\", \":\"]).wait()\n                    \"\"\"\n                ),\n                [\"__getattribute__\", \"__subclasses__\"],\n                None,\n            ),\n        ],\n    )\n    def test_vulnerability_via_dunder_call(\n        self, code, additional_allowed_dunder_methods, expected_dunder_function_error, monkeypatch\n    ):\n        import 
smolagents.local_python_executor\n\n        monkeypatch.setattr(\n            \"smolagents.local_python_executor.ALLOWED_DUNDER_METHODS\",\n            smolagents.local_python_executor.ALLOWED_DUNDER_METHODS + additional_allowed_dunder_methods,\n        )\n        executor = LocalPythonExecutor([])\n        executor.send_tools({})\n        expectation = (\n            pytest.raises(\n                InterpreterError, match=f\"Forbidden call to dunder function: {expected_dunder_function_error}\"\n            )\n            if expected_dunder_function_error\n            else does_not_raise()\n        )\n        with expectation:\n            executor(code)\n"
  },
  {
    "path": "tests/test_mcp_client.py",
    "content": "import json\nfrom textwrap import dedent\n\nimport pytest\nfrom mcp import StdioServerParameters\n\nfrom smolagents.mcp_client import MCPClient\n\n\n@pytest.fixture\ndef echo_server_script():\n    return dedent(\n        '''\n        from mcp.server.fastmcp import FastMCP\n\n        mcp = FastMCP(\"Echo Server\")\n\n        @mcp.tool()\n        def echo_tool(text: str) -> str:\n            \"\"\"Echo the input text\"\"\"\n            return f\"Echo: {text}\"\n\n        mcp.run()\n        '''\n    )\n\n\n@pytest.fixture\ndef structured_output_server_script():\n    return dedent(\n        '''\n        from mcp.server.fastmcp import FastMCP\n        from typing import Any\n\n        mcp = FastMCP(\"Structured Output Server\")\n\n        @mcp.tool()\n        def user_info_tool(name: str) -> dict[str, Any]:\n            \"\"\"Get user information as structured data\"\"\"\n            user_data = {\n                \"name\": name,\n                \"age\": 25,\n                \"email\": f\"{name.lower()}@example.com\",\n                \"active\": True\n            }\n            return user_data\n\n        mcp.run()\n        '''\n    )\n\n\n# Ignore FutureWarning about structured_output default value change: this test intentionally uses default behavior\n@pytest.mark.filterwarnings(\"ignore:.*structured_output:FutureWarning\")\ndef test_mcp_client_with_syntax(echo_server_script: str):\n    \"\"\"Test the MCPClient with the context manager syntax.\"\"\"\n    server_parameters = StdioServerParameters(command=\"python\", args=[\"-c\", echo_server_script])\n    with MCPClient(server_parameters) as tools:\n        assert len(tools) == 1\n        assert tools[0].name == \"echo_tool\"\n        assert tools[0].forward(**{\"text\": \"Hello, world!\"}) == \"Echo: Hello, world!\"\n\n\ndef test_mcp_client_with_structured_output(structured_output_server_script: str):\n    \"\"\"Test the MCPClient with structured_output=True parameter.\"\"\"\n    server_parameters = 
StdioServerParameters(command=\"python\", args=[\"-c\", structured_output_server_script])\n    with MCPClient(server_parameters, structured_output=True) as tools:\n        assert len(tools) == 1\n        assert tools[0].name == \"user_info_tool\"\n        assert tools[0].output_type == \"object\"  # Should be object due to outputSchema\n\n        # Check the output schema {'additionalProperties': True, 'title': 'user_info_toolDictOutput', 'type': 'object'}\n        assert tools[0].output_schema is not None\n        schema = tools[0].output_schema\n        assert isinstance(schema, dict)\n        assert schema.get(\"type\") == \"object\"\n\n        # Test that structured output is properly parsed\n        result = tools[0].forward(**{\"name\": \"Alice\"})\n        assert isinstance(result, dict)\n        assert result[\"name\"] == \"Alice\"\n        assert result[\"age\"] == 25\n        assert result[\"email\"] == \"alice@example.com\"\n        assert result[\"active\"] is True\n\n\ndef test_mcp_client_without_structured_output(structured_output_server_script: str):\n    \"\"\"Test the MCPClient with structured_output=False (default) for comparison.\"\"\"\n    server_parameters = StdioServerParameters(command=\"python\", args=[\"-c\", structured_output_server_script])\n    with MCPClient(server_parameters, structured_output=False) as tools:\n        assert len(tools) == 1\n        assert tools[0].name == \"user_info_tool\"\n        assert tools[0].output_type == \"object\"\n\n        # Test that output is returned as raw text\n        result = tools[0].forward(**{\"name\": \"Alice\"})\n        assert isinstance(result, str)\n        # Should be JSON string, not parsed object\n        parsed_result = json.loads(result)\n        assert parsed_result[\"name\"] == \"Alice\"\n\n\n# Ignore FutureWarning about structured_output default value change: this test intentionally uses default behavior\n@pytest.mark.filterwarnings(\"ignore:.*structured_output:FutureWarning\")\ndef 
test_mcp_client_try_finally_syntax(echo_server_script: str):\n    \"\"\"Test the MCPClient with the try ... finally syntax.\"\"\"\n    server_parameters = StdioServerParameters(command=\"python\", args=[\"-c\", echo_server_script])\n    mcp_client = MCPClient(server_parameters)\n    try:\n        tools = mcp_client.get_tools()\n        assert len(tools) == 1\n        assert tools[0].name == \"echo_tool\"\n        assert tools[0].forward(**{\"text\": \"Hello, world!\"}) == \"Echo: Hello, world!\"\n    finally:\n        mcp_client.disconnect()\n\n\n# Ignore FutureWarning about structured_output default value change: this test intentionally uses default behavior\n@pytest.mark.filterwarnings(\"ignore:.*structured_output:FutureWarning\")\ndef test_multiple_servers(echo_server_script: str):\n    \"\"\"Test the MCPClient with multiple servers.\"\"\"\n    server_parameters = [\n        StdioServerParameters(command=\"python\", args=[\"-c\", echo_server_script]),\n        StdioServerParameters(command=\"python\", args=[\"-c\", echo_server_script]),\n    ]\n    with MCPClient(server_parameters) as tools:\n        assert len(tools) == 2\n        assert tools[0].name == \"echo_tool\"\n        assert tools[1].name == \"echo_tool\"\n        assert tools[0].forward(**{\"text\": \"Hello, world!\"}) == \"Echo: Hello, world!\"\n        assert tools[1].forward(**{\"text\": \"Hello, world!\"}) == \"Echo: Hello, world!\"\n"
  },
  {
    "path": "tests/test_memory.py",
    "content": "import json\n\nimport pytest\nfrom PIL import Image\n\nfrom smolagents.agents import ToolCall\nfrom smolagents.memory import (\n    ActionStep,\n    AgentMemory,\n    ChatMessage,\n    MemoryStep,\n    MessageRole,\n    PlanningStep,\n    SystemPromptStep,\n    TaskStep,\n)\nfrom smolagents.monitoring import Timing, TokenUsage\n\n\nclass TestAgentMemory:\n    def test_initialization(self):\n        system_prompt = \"This is a system prompt.\"\n        memory = AgentMemory(system_prompt=system_prompt)\n        assert memory.system_prompt.system_prompt == system_prompt\n        assert memory.steps == []\n\n    def test_return_all_code_actions(self):\n        memory = AgentMemory(system_prompt=\"This is a system prompt.\")\n        memory.steps = [\n            ActionStep(step_number=1, timing=Timing(start_time=0.0, end_time=1.0), code_action=\"print('Hello')\"),\n            ActionStep(step_number=2, timing=Timing(start_time=0.0, end_time=1.0), code_action=None),\n            ActionStep(step_number=3, timing=Timing(start_time=0.0, end_time=1.0), code_action=\"print('World')\"),\n        ]  # type: ignore\n        assert memory.return_full_code() == \"print('Hello')\\n\\nprint('World')\"\n\n\nclass TestMemoryStep:\n    def test_initialization(self):\n        step = MemoryStep()\n        assert isinstance(step, MemoryStep)\n\n    def test_dict(self):\n        step = MemoryStep()\n        assert step.dict() == {}\n\n    def test_to_messages(self):\n        step = MemoryStep()\n        with pytest.raises(NotImplementedError):\n            step.to_messages()\n\n\ndef test_action_step_dict():\n    action_step = ActionStep(\n        model_input_messages=[ChatMessage(role=MessageRole.USER, content=\"Hello\")],\n        tool_calls=[\n            ToolCall(id=\"id\", name=\"get_weather\", arguments={\"location\": \"Paris\"}),\n        ],\n        timing=Timing(start_time=0.0, end_time=1.0),\n        step_number=1,\n        error=None,\n        
model_output_message=ChatMessage(role=MessageRole.ASSISTANT, content=\"Hi\"),\n        model_output=\"Hi\",\n        observations=\"This is a nice observation\",\n        observations_images=[Image.new(\"RGB\", (100, 100))],\n        action_output=\"Output\",\n        token_usage=TokenUsage(input_tokens=10, output_tokens=20),\n    )\n    action_step_dict = action_step.dict()\n    # Check each key individually for better test failure messages\n    assert \"model_input_messages\" in action_step_dict\n    assert action_step_dict[\"model_input_messages\"] == [\n        {\"role\": MessageRole.USER, \"content\": \"Hello\", \"tool_calls\": None, \"raw\": None, \"token_usage\": None}\n    ]\n\n    assert \"tool_calls\" in action_step_dict\n    assert len(action_step_dict[\"tool_calls\"]) == 1\n    assert action_step_dict[\"tool_calls\"][0] == {\n        \"id\": \"id\",\n        \"type\": \"function\",\n        \"function\": {\n            \"name\": \"get_weather\",\n            \"arguments\": {\"location\": \"Paris\"},\n        },\n    }\n\n    assert \"timing\" in action_step_dict\n    assert action_step_dict[\"timing\"] == {\"start_time\": 0.0, \"end_time\": 1.0, \"duration\": 1.0}\n\n    assert \"token_usage\" in action_step_dict\n    assert action_step_dict[\"token_usage\"] == {\"input_tokens\": 10, \"output_tokens\": 20, \"total_tokens\": 30}\n\n    assert \"step_number\" in action_step_dict\n    assert action_step_dict[\"step_number\"] == 1\n\n    assert \"error\" in action_step_dict\n    assert action_step_dict[\"error\"] is None\n\n    assert \"model_output_message\" in action_step_dict\n    assert action_step_dict[\"model_output_message\"] == {\n        \"role\": \"assistant\",\n        \"content\": \"Hi\",\n        \"tool_calls\": None,\n        \"raw\": None,\n        \"token_usage\": None,\n    }\n\n    assert \"model_output\" in action_step_dict\n    assert action_step_dict[\"model_output\"] == \"Hi\"\n\n    assert \"observations\" in action_step_dict\n    
assert action_step_dict[\"observations\"] == \"This is a nice observation\"\n\n    assert \"observations_images\" in action_step_dict\n\n    assert \"action_output\" in action_step_dict\n    assert action_step_dict[\"action_output\"] == \"Output\"\n\n\ndef test_action_step_to_messages():\n    action_step = ActionStep(\n        model_input_messages=[ChatMessage(role=MessageRole.USER, content=\"Hello\")],\n        tool_calls=[\n            ToolCall(id=\"id\", name=\"get_weather\", arguments={\"location\": \"Paris\"}),\n        ],\n        timing=Timing(start_time=0.0, end_time=1.0),\n        step_number=1,\n        error=None,\n        model_output_message=ChatMessage(role=MessageRole.ASSISTANT, content=\"Hi\"),\n        model_output=\"Hi\",\n        observations=\"This is a nice observation\",\n        observations_images=[Image.new(\"RGB\", (100, 100))],\n        action_output=\"Output\",\n        token_usage=TokenUsage(input_tokens=10, output_tokens=20),\n    )\n    messages = action_step.to_messages()\n    assert len(messages) == 4\n    for message in messages:\n        assert isinstance(message, ChatMessage)\n    assistant_message = messages[0]\n    assert assistant_message.role == MessageRole.ASSISTANT\n    assert len(assistant_message.content) == 1\n    assert assistant_message.content[0][\"type\"] == \"text\"\n    assert assistant_message.content[0][\"text\"] == \"Hi\"\n    message = messages[1]\n    assert message.role == MessageRole.TOOL_CALL\n\n    assert len(message.content) == 1\n    assert message.content[0][\"type\"] == \"text\"\n    assert \"Calling tools:\" in message.content[0][\"text\"]\n\n    image_message = messages[2]\n    assert image_message.content[0][\"type\"] == \"image\"  # type: ignore\n\n    observation_message = messages[3]\n    assert observation_message.role == MessageRole.TOOL_RESPONSE\n    assert \"Observation:\\nThis is a nice observation\" in observation_message.content[0][\"text\"]\n\n\ndef 
test_action_step_to_messages_no_tool_calls_with_observations():\n    action_step = ActionStep(\n        model_input_messages=None,\n        tool_calls=None,\n        timing=Timing(start_time=0.0, end_time=1.0),\n        step_number=1,\n        error=None,\n        model_output_message=None,\n        model_output=None,\n        observations=\"This is an observation.\",\n        observations_images=None,\n        action_output=None,\n        token_usage=TokenUsage(input_tokens=10, output_tokens=20),\n    )\n    messages = action_step.to_messages()\n    assert len(messages) == 1\n    observation_message = messages[0]\n    assert observation_message.role == MessageRole.TOOL_RESPONSE\n    assert \"Observation:\\nThis is an observation.\" in observation_message.content[0][\"text\"]\n\n\ndef test_planning_step_to_messages():\n    planning_step = PlanningStep(\n        model_input_messages=[ChatMessage(role=MessageRole.USER, content=\"Hello\")],\n        model_output_message=ChatMessage(role=MessageRole.ASSISTANT, content=\"Plan\"),\n        plan=\"This is a plan.\",\n        timing=Timing(start_time=0.0, end_time=1.0),\n    )\n    messages = planning_step.to_messages(summary_mode=False)\n    assert len(messages) == 2\n    for message in messages:\n        assert isinstance(message, ChatMessage)\n        assert isinstance(message.content, list)\n        assert len(message.content) == 1\n        for content in message.content:\n            assert isinstance(content, dict)\n            assert \"type\" in content\n            assert \"text\" in content\n    assert messages[0].role == MessageRole.ASSISTANT\n    assert messages[1].role == MessageRole.USER\n\n\ndef test_task_step_to_messages():\n    task_step = TaskStep(task=\"This is a task.\", task_images=[Image.new(\"RGB\", (100, 100))])\n    messages = task_step.to_messages(summary_mode=False)\n    assert len(messages) == 1\n    for message in messages:\n        assert isinstance(message, ChatMessage)\n        assert 
message.role == MessageRole.USER\n        assert isinstance(message.content, list)\n        assert len(message.content) == 2\n        text_content = message.content[0]\n        assert isinstance(text_content, dict)\n        assert \"type\" in text_content\n        assert \"text\" in text_content\n        for image_content in message.content[1:]:\n            assert isinstance(image_content, dict)\n            assert \"type\" in image_content\n            assert \"image\" in image_content\n\n\ndef test_system_prompt_step_to_messages():\n    system_prompt_step = SystemPromptStep(system_prompt=\"This is a system prompt.\")\n    messages = system_prompt_step.to_messages(summary_mode=False)\n    assert len(messages) == 1\n    for message in messages:\n        assert isinstance(message, ChatMessage)\n        assert message.role == MessageRole.SYSTEM\n        assert isinstance(message.content, list)\n        assert len(message.content) == 1\n        for content in message.content:\n            assert isinstance(content, dict)\n            assert \"type\" in content\n            assert \"text\" in content\n\n\ndef test_memory_step_json_serialization():\n    \"\"\"Test that memory steps can be JSON serialized even when the raw field holds a non-serializable object.\"\"\"\n\n    # Create a mock ChatCompletion-like object (this is what was causing the error)\n    class MockChatCompletion:\n        def __init__(self):\n            self.id = \"chatcmpl-test\"\n            self.choices = []\n\n    # Create a ChatMessage with raw field containing the non-serializable object\n    chat_message = ChatMessage(role=MessageRole.ASSISTANT, content=\"Test response\", raw=MockChatCompletion())\n\n    # Test ActionStep serialization\n    action_step = ActionStep(\n        step_number=1,\n        timing=Timing(start_time=123456, end_time=123457),\n        model_output_message=chat_message,\n        model_input_messages=[chat_message],\n    )\n\n    step_dict = action_step.dict()\n    json_str = json.dumps(step_dict)\n    # Raw 
field should be present but serializable\n    assert \"raw\" in json_str\n    assert \"MockChatCompletion\" in json_str\n\n    # Test PlanningStep serialization\n    planning_step = PlanningStep(\n        model_input_messages=[chat_message],\n        model_output_message=chat_message,\n        plan=\"Test plan\",\n        timing=Timing(start_time=123456, end_time=123457),\n    )\n\n    planning_dict = planning_step.dict()\n    json_str = json.dumps(planning_dict)\n    # Raw field should be present but serializable\n    assert \"raw\" in json_str\n    assert \"MockChatCompletion\" in json_str\n"
  },
  {
    "path": "tests/test_models.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport json\nimport sys\nfrom contextlib import ExitStack\nfrom unittest.mock import MagicMock, patch\n\nimport pytest\nfrom huggingface_hub import ChatCompletionOutputMessage\n\nfrom smolagents.default_tools import FinalAnswerTool\nfrom smolagents.models import (\n    AmazonBedrockModel,\n    AzureOpenAIModel,\n    ChatMessage,\n    ChatMessageToolCall,\n    InferenceClientModel,\n    LiteLLMModel,\n    LiteLLMRouterModel,\n    MessageRole,\n    MLXModel,\n    Model,\n    OpenAIModel,\n    TransformersModel,\n    get_clean_message_list,\n    get_tool_call_from_text,\n    get_tool_json_schema,\n    parse_json_if_needed,\n    remove_content_after_stop_sequences,\n    supports_stop_parameter,\n)\nfrom smolagents.tools import tool\n\nfrom .utils.markers import require_run_all\n\n\nclass TestModel:\n    def test_prepare_completion_kwargs_parameter_precedence(self):\n        \"\"\"Test that self.kwargs have highest precedence and REMOVE_PARAMETER works correctly\"\"\"\n        from smolagents.models import REMOVE_PARAMETER\n\n        # Test with self.kwargs having highest precedence\n        model = Model(max_tokens=100, temperature=0.5)\n        completion_kwargs = model._prepare_completion_kwargs(\n            messages=[ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello\"}])],\n            max_tokens=50,  # This should 
be overridden by self.kwargs\n            top_p=0.9,  # This should remain from kwargs\n        )\n\n        # self.kwargs should have highest precedence\n        assert completion_kwargs[\"max_tokens\"] == 100\n        assert completion_kwargs[\"temperature\"] == 0.5\n        assert completion_kwargs[\"top_p\"] == 0.9\n\n        # Test REMOVE_PARAMETER functionality\n        model_with_removal = Model(max_tokens=REMOVE_PARAMETER, temperature=0.7)\n        completion_kwargs = model_with_removal._prepare_completion_kwargs(\n            messages=[ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello\"}])],\n            max_tokens=200,  # This should be removed by REMOVE_PARAMETER\n            top_p=0.8,\n        )\n\n        # max_tokens should be removed, temperature should be set\n        assert \"max_tokens\" not in completion_kwargs\n        assert completion_kwargs[\"temperature\"] == 0.7\n        assert completion_kwargs[\"top_p\"] == 0.8\n\n    def test_agglomerate_stream_deltas(self):\n        from smolagents.models import (\n            ChatMessageStreamDelta,\n            ChatMessageToolCallFunction,\n            ChatMessageToolCallStreamDelta,\n            agglomerate_stream_deltas,\n        )\n        from smolagents.monitoring import TokenUsage\n\n        stream_deltas = [\n            ChatMessageStreamDelta(\n                content=\"Hi\",\n                tool_calls=[\n                    ChatMessageToolCallStreamDelta(\n                        index=0,\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(arguments=\"\", name=\"web_search\", description=None),\n                    )\n                ],\n                token_usage=None,\n            ),\n            ChatMessageStreamDelta(\n                content=\" everyone\",\n                tool_calls=[\n                    ChatMessageToolCallStreamDelta(\n                        index=0,\n                        
type=\"function\",\n                        function=ChatMessageToolCallFunction(arguments=' {\"', name=\"web_search\", description=None),\n                    )\n                ],\n                token_usage=None,\n            ),\n            ChatMessageStreamDelta(\n                content=\", it's\",\n                tool_calls=[\n                    ChatMessageToolCallStreamDelta(\n                        index=0,\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            arguments='query\": \"current pope name and date of birth\"}',\n                            name=\"web_search\",\n                            description=None,\n                        ),\n                    )\n                ],\n                token_usage=None,\n            ),\n            ChatMessageStreamDelta(\n                content=\"\",\n                tool_calls=None,\n                token_usage=TokenUsage(input_tokens=1348, output_tokens=24),\n            ),\n        ]\n        agglomerated_stream_delta = agglomerate_stream_deltas(stream_deltas)\n        assert agglomerated_stream_delta.content == \"Hi everyone, it's\"\n        assert (\n            agglomerated_stream_delta.tool_calls[0].function.arguments\n            == ' {\"query\": \"current pope name and date of birth\"}'\n        )\n        assert agglomerated_stream_delta.token_usage.total_tokens == 1372\n\n    @pytest.mark.parametrize(\n        \"model_id, stop_sequences, should_contain_stop\",\n        [\n            (\"regular-model\", [\"stop1\", \"stop2\"], True),  # Regular model should include stop\n            (\"openai/o3\", [\"stop1\", \"stop2\"], False),  # o3 model should not include stop\n            (\"openai/o4-mini\", [\"stop1\", \"stop2\"], False),  # o4-mini model should not include stop\n            (\"something/else/o3\", [\"stop1\", \"stop2\"], False),  # Path ending with o3 should not include stop\n            
(\"something/else/o4-mini\", [\"stop1\", \"stop2\"], False),  # Path ending with o4-mini should not include stop\n            (\"o3\", [\"stop1\", \"stop2\"], False),  # Exact o3 model should not include stop\n            (\"o4-mini\", [\"stop1\", \"stop2\"], False),  # Exact o4-mini model should not include stop\n            (\"regular-model\", None, False),  # None stop_sequences should not add stop parameter\n        ],\n    )\n    def test_prepare_completion_kwargs_stop_sequences(self, model_id, stop_sequences, should_contain_stop):\n        model = Model()\n        model.model_id = model_id\n        completion_kwargs = model._prepare_completion_kwargs(\n            messages=[\n                ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello\"}]),\n            ],\n            stop_sequences=stop_sequences,\n        )\n        # Verify that the stop parameter is only included when appropriate\n        if should_contain_stop:\n            assert \"stop\" in completion_kwargs\n            assert completion_kwargs[\"stop\"] == stop_sequences\n        else:\n            assert \"stop\" not in completion_kwargs\n\n    @pytest.mark.parametrize(\n        \"with_tools, tool_choice, expected_result\",\n        [\n            # Default behavior: With tools but no explicit tool_choice, should default to \"required\"\n            (True, ..., {\"has_tool_choice\": True, \"value\": \"required\"}),\n            # Custom value: With tools and explicit tool_choice=\"auto\"\n            (True, \"auto\", {\"has_tool_choice\": True, \"value\": \"auto\"}),\n            # Tool name as string\n            (True, \"valid_tool_function\", {\"has_tool_choice\": True, \"value\": \"valid_tool_function\"}),\n            # Tool choice as dictionary\n            (\n                True,\n                {\"type\": \"function\", \"function\": {\"name\": \"valid_tool_function\"}},\n                {\"has_tool_choice\": True, \"value\": {\"type\": \"function\", 
\"function\": {\"name\": \"valid_tool_function\"}}},\n            ),\n            # With tools but explicit None tool_choice: should exclude tool_choice\n            (True, None, {\"has_tool_choice\": False, \"value\": None}),\n            # Without tools: tool_choice should never be included\n            (False, \"required\", {\"has_tool_choice\": False, \"value\": None}),\n            (False, \"auto\", {\"has_tool_choice\": False, \"value\": None}),\n            (False, None, {\"has_tool_choice\": False, \"value\": None}),\n            (False, ..., {\"has_tool_choice\": False, \"value\": None}),\n        ],\n    )\n    def test_prepare_completion_kwargs_tool_choice(self, with_tools, tool_choice, expected_result, example_tool):\n        model = Model()\n        kwargs = {\"messages\": [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello\"}])]}\n        if with_tools:\n            kwargs[\"tools_to_call_from\"] = [example_tool]\n        if tool_choice is not ...:\n            kwargs[\"tool_choice\"] = tool_choice\n\n        completion_kwargs = model._prepare_completion_kwargs(**kwargs)\n\n        if expected_result[\"has_tool_choice\"]:\n            assert \"tool_choice\" in completion_kwargs\n            assert completion_kwargs[\"tool_choice\"] == expected_result[\"value\"]\n        else:\n            assert \"tool_choice\" not in completion_kwargs\n\n    def test_get_json_schema_has_nullable_args(self):\n        @tool\n        def get_weather(location: str, celsius: bool | None = False) -> str:\n            \"\"\"\n            Get weather in the next days at given location.\n            Secretly this tool does not care about the location, it hates the weather everywhere.\n\n            Args:\n                location: the location\n                celsius: the temperature type\n            \"\"\"\n            return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n        assert \"nullable\" in 
get_tool_json_schema(get_weather)[\"function\"][\"parameters\"][\"properties\"][\"celsius\"]\n\n    def test_chatmessage_has_model_dumps_json(self):\n        message = ChatMessage(\"user\", [{\"type\": \"text\", \"text\": \"Hello!\"}])\n        data = json.loads(message.model_dump_json())\n        assert data[\"content\"] == [{\"type\": \"text\", \"text\": \"Hello!\"}]\n\n    def test_chatmessage_from_dict_role_conversion(self):\n        message_data = {\n            \"role\": \"user\",\n            \"content\": [{\"type\": \"text\", \"text\": \"Hello!\"}],\n        }\n        message = ChatMessage.from_dict(message_data)\n        assert isinstance(message.role, MessageRole)\n        assert message.role == MessageRole.USER\n        assert message.role.value == \"user\"\n        assert message.content == [{\"type\": \"text\", \"text\": \"Hello!\"}]\n\n        message_data[\"role\"] = MessageRole.ASSISTANT\n        message2 = ChatMessage.from_dict(message_data)\n        assert isinstance(message2.role, MessageRole)\n        assert message2.role == MessageRole.ASSISTANT\n\n    @pytest.mark.skipif(not sys.platform.startswith(\"darwin\"), reason=\"requires macOS\")\n    def test_get_mlx_message_no_tool(self):\n        model = MLXModel(model_id=\"HuggingFaceTB/SmolLM2-135M-Instruct\", max_tokens=10)\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}])]\n        output = model(messages, stop_sequences=[\"great\"]).content\n        assert output.startswith(\"Hello\")\n\n    @pytest.mark.skipif(not sys.platform.startswith(\"darwin\"), reason=\"requires macOS\")\n    def test_get_mlx_message_tricky_stop_sequence(self):\n        # In this test HuggingFaceTB/SmolLM2-135M-Instruct generates the token \">'\"\n        # which is required to test capturing stop_sequences that have extra chars at the end.\n        model = MLXModel(model_id=\"HuggingFaceTB/SmolLM2-135M-Instruct\", max_tokens=100)\n        stop_sequence = \" 
print '>\"\n        messages = [\n            ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": f\"Please{stop_sequence}'\"}]),\n        ]\n        # check our assumption that \">\" is followed by \"'\"\n        assert model.tokenizer.vocab[\">'\"]\n        assert model(messages, stop_sequences=[]).content == f\"I'm ready to help you{stop_sequence}'\"\n        # check stop_sequence capture when output has trailing chars\n        assert model(messages, stop_sequences=[stop_sequence]).content == \"I'm ready to help you\"\n\n    def test_transformers_message_no_tool(self, monkeypatch):\n        monkeypatch.setattr(\"huggingface_hub.constants.HF_HUB_DOWNLOAD_TIMEOUT\", 30)  # instead of 10\n        model = TransformersModel(\n            model_id=\"HuggingFaceTB/SmolLM2-135M-Instruct\",\n            max_new_tokens=5,\n            device_map=\"cpu\",\n            do_sample=False,\n        )\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}])]\n        output = model.generate(messages).content\n        assert output == \"Hello! I'm here\"\n\n        output = model.generate_stream(messages, stop_sequences=[\"great\"])\n        output_str = \"\"\n        for el in output:\n            output_str += el.content\n        assert output_str == \"Hello! 
I'm here\"\n\n    def test_transformers_message_vl_no_tool(self, shared_datadir, monkeypatch):\n        monkeypatch.setattr(\"huggingface_hub.constants.HF_HUB_DOWNLOAD_TIMEOUT\", 30)  # instead of 10\n        import PIL.Image\n\n        img = PIL.Image.open(shared_datadir / \"000000039769.png\")\n        model = TransformersModel(\n            model_id=\"llava-hf/llava-interleave-qwen-0.5b-hf\",\n            max_new_tokens=4,\n            device_map=\"cpu\",\n            do_sample=False,\n        )\n        messages = [\n            ChatMessage(\n                role=MessageRole.USER,\n                content=[{\"type\": \"text\", \"text\": \"What is this?\"}, {\"type\": \"image\", \"image\": img}],\n            )\n        ]\n        output = model.generate(messages).content\n        assert output == \"I am a very\"  # TODO: Investigate possible regression; see #1416\n\n        output = model.generate_stream(messages, stop_sequences=[\"great\"])\n        output_str = \"\"\n        for el in output:\n            output_str += el.content\n        assert output_str == \"I am a very\"  # TODO: Investigate possible regression; see #1416\n\n    def test_parse_json_if_needed(self):\n        args = \"abc\"\n        parsed_args = parse_json_if_needed(args)\n        assert parsed_args == \"abc\"\n\n        args = '{\"a\": 3}'\n        parsed_args = parse_json_if_needed(args)\n        assert parsed_args == {\"a\": 3}\n\n        args = \"3\"\n        parsed_args = parse_json_if_needed(args)\n        assert parsed_args == 3\n\n        args = 3\n        parsed_args = parse_json_if_needed(args)\n        assert parsed_args == 3\n\n\nclass TestInferenceClientModel:\n    def test_call_with_custom_role_conversions(self):\n        custom_role_conversions = {MessageRole.USER: MessageRole.SYSTEM}\n        model = InferenceClientModel(model_id=\"test-model\", custom_role_conversions=custom_role_conversions)\n        model.client = MagicMock()\n        mock_response = 
model.client.chat_completion.return_value\n        mock_response.choices[0].message = ChatCompletionOutputMessage(role=MessageRole.ASSISTANT)\n        messages = [ChatMessage(role=MessageRole.USER, content=\"Test message\")]\n        _ = model(messages)\n        # Verify that the role conversion was applied\n        assert model.client.chat_completion.call_args.kwargs[\"messages\"][0][\"role\"] == \"system\", (\n            \"role conversion should be applied\"\n        )\n\n    def test_init_model_with_tokens(self):\n        model = InferenceClientModel(model_id=\"test-model\", token=\"abc\")\n        assert model.client.token == \"abc\"\n\n        model = InferenceClientModel(model_id=\"test-model\", api_key=\"abc\")\n        assert model.client.token == \"abc\"\n\n        with pytest.raises(ValueError, match=\"Received both `token` and `api_key` arguments.\"):\n            InferenceClientModel(model_id=\"test-model\", token=\"abc\", api_key=\"def\")\n\n    def test_structured_outputs_with_unsupported_provider(self):\n        with pytest.raises(\n            ValueError, match=\"InferenceClientModel only supports structured outputs with these providers:\"\n        ):\n            model = InferenceClientModel(model_id=\"test-model\", token=\"abc\", provider=\"some_provider\")\n            model.generate(\n                messages=[ChatMessage(role=MessageRole.USER, content=\"Hello!\")],\n                response_format={\"type\": \"json_object\"},\n            )\n\n    @require_run_all\n    def test_get_hfapi_message_no_tool(self):\n        model = InferenceClientModel(model_id=\"Qwen/Qwen2.5-Coder-32B-Instruct\", max_tokens=10)\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}])]\n        model(messages, stop_sequences=[\"great\"])\n\n    @require_run_all\n    def test_get_hfapi_message_no_tool_external_provider(self):\n        model = InferenceClientModel(model_id=\"Qwen/Qwen2.5-Coder-32B-Instruct\", 
provider=\"together\", max_tokens=10)\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}])]\n        model(messages, stop_sequences=[\"great\"])\n\n    @require_run_all\n    def test_get_hfapi_message_stream_no_tool(self):\n        model = InferenceClientModel(model_id=\"Qwen/Qwen2.5-Coder-32B-Instruct\", max_tokens=10)\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}])]\n        for el in model.generate_stream(messages, stop_sequences=[\"great\"]):\n            assert el.content is not None\n\n    @require_run_all\n    def test_get_hfapi_message_stream_no_tool_external_provider(self):\n        model = InferenceClientModel(model_id=\"Qwen/Qwen2.5-Coder-32B-Instruct\", provider=\"together\", max_tokens=10)\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}])]\n        for el in model.generate_stream(messages, stop_sequences=[\"great\"]):\n            assert el.content is not None\n\n\nclass TestLiteLLMModel:\n    @pytest.mark.parametrize(\n        \"model_id\",\n        [\n            \"groq/llama-3.3-70b\",\n            \"cerebras/llama-3.3-70b\",\n            \"mistral/mistral-tiny\",\n        ],\n    )\n    def test_call_different_providers_without_key(self, model_id):\n        # Different litellm versions produce different error messages for missing API keys\n        # This test checks for the presence of any common authentication-related error phrases\n        possible_error_messages = [\n            \"Missing API Key\",\n            \"Wrong API Key\",\n            \"Invalid API Key\",\n            \"The api_key client option must be set\",\n            \"AuthenticationError\",\n            \"Unauthorized\",\n        ]\n        model = LiteLLMModel(model_id=model_id)\n        messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Test message\"}])]\n        
# Test generate method\n        with pytest.raises(Exception) as e:\n            model.generate(messages)\n        error_message = str(e)\n        assert any(possible_error_message in error_message for possible_error_message in possible_error_messages), (\n            f\"Error message '{error_message}' does not contain any expected phrases\"\n        )\n        # Test generate_stream method\n        with pytest.raises(Exception) as e:\n            for el in model.generate_stream(messages):\n                assert el.content is not None\n        error_message = str(e)\n        assert any(possible_error_message in error_message for possible_error_message in possible_error_messages), (\n            f\"Error message '{error_message}' does not contain any expected phrases\"\n        )\n\n    def test_retry_on_rate_limit_error(self):\n        \"\"\"Test that the retry mechanism does trigger on 429 rate limit errors\"\"\"\n        import time\n\n        # Patch RETRY_WAIT to 0.1 seconds for faster testing\n        mock_litellm = MagicMock()\n\n        with (\n            patch(\"smolagents.models.RETRY_WAIT\", 0.1),\n            patch(\"smolagents.utils.random.random\", side_effect=[0.1, 0.1]),\n            patch(\"smolagents.models.LiteLLMModel.create_client\", return_value=mock_litellm),\n        ):\n            model = LiteLLMModel(model_id=\"test-model\")\n            messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Test message\"}])]\n\n            # Create a mock response for successful call\n            mock_success_response = MagicMock()\n            mock_success_response.choices = [MagicMock()]\n            # Set content directly (not through model_dump)\n            mock_success_response.choices[0].message.content = \"Success response\"\n            mock_success_response.choices[0].message.role = \"assistant\"\n            mock_success_response.choices[0].message.tool_calls = None\n            
mock_success_response.usage.prompt_tokens = 10\n            mock_success_response.usage.completion_tokens = 20\n\n            # Create a 429 rate limit error\n            rate_limit_error = Exception(\"Error code: 429 - Rate limit exceeded\")\n\n            # Mock the litellm client to raise an error twice, and then succeed\n            model.client.completion.side_effect = [rate_limit_error, rate_limit_error, mock_success_response]\n\n            # Measure time to verify retry wait time\n            start_time = time.time()\n            result = model.generate(messages)\n            elapsed_time = time.time() - start_time\n\n            # Verify that completion was called thrice (twice failed, once succeeded)\n            assert model.client.completion.call_count == 3\n            assert result.content == \"Success response\"\n            assert result.token_usage.input_tokens == 10\n            assert result.token_usage.output_tokens == 20\n\n            # Verify that the wait time was around\n            # 0.22s (1st retry) [0.1 * 2.0 * (1 + 1 * 0.1)]\n            # + 0.48s (2nd retry) [0.22 * 2.0 * (1 + 1 * 0.1)]\n            # = 0.704s (allow some tolerance)\n            assert 0.67 <= elapsed_time <= 0.73\n\n    def test_passing_flatten_messages(self):\n        model = LiteLLMModel(model_id=\"groq/llama-3.3-70b\", flatten_messages_as_text=False)\n        assert not model.flatten_messages_as_text\n\n        model = LiteLLMModel(model_id=\"fal/llama-3.3-70b\", flatten_messages_as_text=True)\n        assert model.flatten_messages_as_text\n\n\nclass TestLiteLLMRouterModel:\n    @pytest.mark.parametrize(\n        \"model_id, expected\",\n        [\n            (\"llama-3.3-70b\", False),\n            (\"llama-3.3-70b\", True),\n            (\"mistral-tiny\", True),\n        ],\n    )\n    def test_flatten_messages_as_text(self, model_id, expected):\n        model_list = [\n            {\"model_name\": \"llama-3.3-70b\", \"litellm_params\": {\"model\": 
\"groq/llama-3.3-70b\"}},\n            {\"model_name\": \"llama-3.3-70b\", \"litellm_params\": {\"model\": \"cerebras/llama-3.3-70b\"}},\n            {\"model_name\": \"mistral-tiny\", \"litellm_params\": {\"model\": \"mistral/mistral-tiny\"}},\n        ]\n        model = LiteLLMRouterModel(model_id=model_id, model_list=model_list, flatten_messages_as_text=expected)\n        assert model.flatten_messages_as_text is expected\n\n    def test_create_client(self):\n        model_list = [\n            {\"model_name\": \"llama-3.3-70b\", \"litellm_params\": {\"model\": \"groq/llama-3.3-70b\"}},\n            {\"model_name\": \"llama-3.3-70b\", \"litellm_params\": {\"model\": \"cerebras/llama-3.3-70b\"}},\n        ]\n        with patch(\"litellm.router.Router\") as mock_router:\n            router_model = LiteLLMRouterModel(\n                model_id=\"model-group-1\", model_list=model_list, client_kwargs={\"routing_strategy\": \"simple-shuffle\"}\n            )\n            # Ensure that the Router constructor was called with the expected keyword arguments\n            mock_router.assert_called_once()\n            assert mock_router.call_count == 1\n            assert mock_router.call_args.kwargs[\"model_list\"] == model_list\n            assert mock_router.call_args.kwargs[\"routing_strategy\"] == \"simple-shuffle\"\n            assert router_model.client == mock_router.return_value\n\n\nclass TestOpenAIModel:\n    def test_client_kwargs_passed_correctly(self):\n        model_id = \"gpt-3.5-turbo\"\n        api_base = \"https://api.openai.com/v1\"\n        api_key = \"test_api_key\"\n        organization = \"test_org\"\n        project = \"test_project\"\n        client_kwargs = {\"max_retries\": 5}\n\n        with patch(\"openai.OpenAI\") as MockOpenAI:\n            model = OpenAIModel(\n                model_id=model_id,\n                api_base=api_base,\n                api_key=api_key,\n                organization=organization,\n                project=project,\n  
              client_kwargs=client_kwargs,\n            )\n        MockOpenAI.assert_called_once_with(\n            base_url=api_base, api_key=api_key, organization=organization, project=project, max_retries=5\n        )\n        assert model.client == MockOpenAI.return_value\n\n    @require_run_all\n    def test_streaming_tool_calls(self):\n        model = OpenAIModel(model_id=\"gpt-4o-mini\")\n        messages = [\n            ChatMessage(\n                role=MessageRole.USER,\n                content=[\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Hello! Please return the final answer 'blob' and the final answer 'blob2' in two parallel tool calls\",\n                    }\n                ],\n            ),\n        ]\n        for el in model.generate_stream(messages, tools_to_call_from=[FinalAnswerTool()]):\n            if el.tool_calls:\n                assert el.tool_calls[0].function.name == \"final_answer\"\n                args = el.tool_calls[0].function.arguments\n                if len(el.tool_calls) > 1:\n                    assert el.tool_calls[1].function.name == \"final_answer\"\n                    args2 = el.tool_calls[1].function.arguments\n        assert args == '{\"answer\": \"blob\"}'\n        assert args2 == '{\"answer\": \"blob2\"}'\n\n    def test_stop_sequence_cutting_for_o4_mini(self):\n        \"\"\"Test that stop sequences are cut a posteriori for models that don't support stop parameter\"\"\"\n        # Create a mock response that contains a stop sequence in the middle\n        mock_response = MagicMock()\n        mock_response.choices = [MagicMock()]\n        mock_response.choices[0].message.role = \"assistant\"\n        mock_response.choices[0].message.content = \"This is some text<STOP>and this should be removed\"\n        mock_response.choices[0].message.tool_calls = None\n        mock_response.usage.prompt_tokens = 10\n        mock_response.usage.completion_tokens = 
20\n\n        with patch(\"openai.OpenAI\") as MockOpenAI:\n            mock_client = MagicMock()\n            MockOpenAI.return_value = mock_client\n            mock_client.chat.completions.create.return_value = mock_response\n\n            model = OpenAIModel(model_id=\"o4-mini\")\n            messages = [ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello\"}])]\n            result = model.generate(messages, stop_sequences=[\"<STOP>\"])\n\n            # Verify the stop sequence was removed\n            assert result.content == \"This is some text\"\n            assert \"<STOP>\" not in result.content\n            assert \"and this should be removed\" not in result.content\n\n\nclass TestAmazonBedrockModel:\n    def test_client_for_bedrock(self):\n        model_id = \"us.amazon.nova-pro-v1:0\"\n\n        with patch(\"boto3.client\") as MockBoto3:\n            model = AmazonBedrockModel(\n                model_id=model_id,\n            )\n\n        assert model.client == MockBoto3.return_value\n\n\nclass TestAzureOpenAIModel:\n    def test_client_kwargs_passed_correctly(self):\n        model_id = \"gpt-3.5-turbo\"\n        api_key = \"test_api_key\"\n        api_version = \"2023-12-01-preview\"\n        azure_endpoint = \"https://example-resource.azure.openai.com/\"\n        organization = \"test_org\"\n        project = \"test_project\"\n        client_kwargs = {\"max_retries\": 5}\n\n        with patch(\"openai.OpenAI\") as MockOpenAI, patch(\"openai.AzureOpenAI\") as MockAzureOpenAI:\n            model = AzureOpenAIModel(\n                model_id=model_id,\n                api_key=api_key,\n                api_version=api_version,\n                azure_endpoint=azure_endpoint,\n                organization=organization,\n                project=project,\n                client_kwargs=client_kwargs,\n            )\n        assert MockOpenAI.call_count == 0\n        MockAzureOpenAI.assert_called_once_with(\n            
base_url=None,\n            api_key=api_key,\n            api_version=api_version,\n            azure_endpoint=azure_endpoint,\n            organization=organization,\n            project=project,\n            max_retries=5,\n        )\n        assert model.client == MockAzureOpenAI.return_value\n\n\nclass TestTransformersModel:\n    @pytest.mark.parametrize(\n        \"patching\",\n        [\n            [\n                (\n                    \"transformers.AutoModelForImageTextToText.from_pretrained\",\n                    {\"side_effect\": ValueError(\"Unrecognized configuration class\")},\n                ),\n                (\"transformers.AutoModelForCausalLM.from_pretrained\", {}),\n                (\"transformers.AutoTokenizer.from_pretrained\", {}),\n            ],\n            [\n                (\"transformers.AutoModelForImageTextToText.from_pretrained\", {}),\n                (\"transformers.AutoProcessor.from_pretrained\", {}),\n            ],\n        ],\n    )\n    def test_init(self, patching):\n        with ExitStack() as stack:\n            mocks = {target: stack.enter_context(patch(target, **kwargs)) for target, kwargs in patching}\n            model = TransformersModel(\n                model_id=\"test-model\", device_map=\"cpu\", torch_dtype=\"float16\", trust_remote_code=True\n            )\n        assert model.model_id == \"test-model\"\n        if \"transformers.AutoTokenizer.from_pretrained\" in mocks:\n            assert model.model == mocks[\"transformers.AutoModelForCausalLM.from_pretrained\"].return_value\n            assert mocks[\"transformers.AutoModelForCausalLM.from_pretrained\"].call_args.kwargs == {\n                \"device_map\": \"cpu\",\n                \"torch_dtype\": \"float16\",\n                \"trust_remote_code\": True,\n            }\n            assert model.tokenizer == mocks[\"transformers.AutoTokenizer.from_pretrained\"].return_value\n            assert 
mocks[\"transformers.AutoTokenizer.from_pretrained\"].call_args.args == (\"test-model\",)\n            assert mocks[\"transformers.AutoTokenizer.from_pretrained\"].call_args.kwargs == {\"trust_remote_code\": True}\n        elif \"transformers.AutoProcessor.from_pretrained\" in mocks:\n            assert model.model == mocks[\"transformers.AutoModelForImageTextToText.from_pretrained\"].return_value\n            assert mocks[\"transformers.AutoModelForImageTextToText.from_pretrained\"].call_args.kwargs == {\n                \"device_map\": \"cpu\",\n                \"torch_dtype\": \"float16\",\n                \"trust_remote_code\": True,\n            }\n            assert model.processor == mocks[\"transformers.AutoProcessor.from_pretrained\"].return_value\n            assert mocks[\"transformers.AutoProcessor.from_pretrained\"].call_args.args == (\"test-model\",)\n            assert mocks[\"transformers.AutoProcessor.from_pretrained\"].call_args.kwargs == {\"trust_remote_code\": True}\n\n\ndef test_get_clean_message_list_basic():\n    messages = [\n        ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}]),\n        ChatMessage(role=MessageRole.ASSISTANT, content=[{\"type\": \"text\", \"text\": \"Hi there!\"}]),\n    ]\n    result = get_clean_message_list(messages)\n    assert len(result) == 2\n    assert result[0][\"role\"] == \"user\"\n    assert result[0][\"content\"][0][\"text\"] == \"Hello!\"\n    assert result[1][\"role\"] == \"assistant\"\n    assert result[1][\"content\"][0][\"text\"] == \"Hi there!\"\n\n\n@pytest.mark.parametrize(\n    \"messages,expected_roles,expected_texts\",\n    [\n        (\n            [\n                {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello!\"}]},\n                {\"role\": \"assistant\", \"content\": [{\"type\": \"text\", \"text\": \"Hi there!\"}]},\n            ],\n            [\"user\", \"assistant\"],\n            [\"Hello!\", \"Hi there!\"],\n        
),\n        (\n            [\n                {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"How are you?\"}]},\n            ],\n            [\"user\"],\n            [\"How are you?\"],\n        ),\n    ],\n)\ndef test_get_clean_message_list_with_dicts(messages, expected_roles, expected_texts):\n    result = get_clean_message_list(messages)\n    assert len(result) == len(messages)\n    for i, msg in enumerate(result):\n        assert msg[\"role\"] == expected_roles[i]\n        assert msg[\"content\"][0][\"text\"] == expected_texts[i]\n\n\ndef test_get_clean_message_list_role_conversions():\n    messages = [\n        ChatMessage(role=MessageRole.TOOL_CALL, content=[{\"type\": \"text\", \"text\": \"Calling tool...\"}]),\n        ChatMessage(role=MessageRole.TOOL_RESPONSE, content=[{\"type\": \"text\", \"text\": \"Tool response\"}]),\n    ]\n    result = get_clean_message_list(messages, role_conversions={\"tool-call\": \"assistant\", \"tool-response\": \"user\"})\n    assert len(result) == 2\n    assert result[0][\"role\"] == \"assistant\"\n    assert result[0][\"content\"][0][\"text\"] == \"Calling tool...\"\n    assert result[1][\"role\"] == \"user\"\n    assert result[1][\"content\"][0][\"text\"] == \"Tool response\"\n\n\ndef test_remove_content_after_stop_sequences():\n    content = \"Hello<code>world!\"\n    stop_sequences = [\"<code>\"]\n    removed_content = remove_content_after_stop_sequences(content, stop_sequences)\n    assert removed_content == \"Hello\"\n\n\ndef test_remove_content_after_stop_sequences_handles_none():\n    # Test with None stop sequence\n    content = \"Hello world!\"\n    removed_content = remove_content_after_stop_sequences(content, None)\n    assert removed_content == content\n\n    # Test with None content\n    removed_content = remove_content_after_stop_sequences(None, [\"<code>\"])\n    assert removed_content is None\n\n\n@pytest.mark.parametrize(\n    \"convert_images_to_image_urls, expected_clean_message\",\n   
 [\n        (\n            False,\n            dict(\n                role=MessageRole.USER,\n                content=[\n                    {\"type\": \"image\", \"image\": \"encoded_image\"},\n                    {\"type\": \"image\", \"image\": \"second_encoded_image\"},\n                ],\n            ),\n        ),\n        (\n            True,\n            dict(\n                role=MessageRole.USER,\n                content=[\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,encoded_image\"}},\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,second_encoded_image\"}},\n                ],\n            ),\n        ),\n    ],\n)\ndef test_get_clean_message_list_image_encoding(convert_images_to_image_urls, expected_clean_message):\n    message = ChatMessage(\n        role=MessageRole.USER,\n        content=[{\"type\": \"image\", \"image\": b\"image_data\"}, {\"type\": \"image\", \"image\": b\"second_image_data\"}],\n    )\n    with patch(\"smolagents.models.encode_image_base64\") as mock_encode:\n        mock_encode.side_effect = [\"encoded_image\", \"second_encoded_image\"]\n        result = get_clean_message_list([message], convert_images_to_image_urls=convert_images_to_image_urls)\n        mock_encode.assert_any_call(b\"image_data\")\n        mock_encode.assert_any_call(b\"second_image_data\")\n        assert len(result) == 1\n        assert result[0] == expected_clean_message\n\n\ndef test_get_clean_message_list_flatten_messages_as_text():\n    messages = [\n        ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"Hello!\"}]),\n        ChatMessage(role=MessageRole.USER, content=[{\"type\": \"text\", \"text\": \"How are you?\"}]),\n    ]\n    result = get_clean_message_list(messages, flatten_messages_as_text=True)\n    assert len(result) == 1\n    assert result[0][\"role\"] == \"user\"\n    assert result[0][\"content\"] == \"Hello!\\nHow 
are you?\"\n\n\n@pytest.mark.parametrize(\n    \"model_class, model_kwargs, patching, expected_flatten_messages_as_text\",\n    [\n        (AzureOpenAIModel, {}, (\"openai.AzureOpenAI\", {}), False),\n        (InferenceClientModel, {}, (\"huggingface_hub.InferenceClient\", {}), False),\n        (LiteLLMModel, {}, None, False),\n        (LiteLLMModel, {\"model_id\": \"ollama\"}, None, True),\n        (LiteLLMModel, {\"model_id\": \"groq\"}, None, True),\n        (LiteLLMModel, {\"model_id\": \"cerebras\"}, None, True),\n        (MLXModel, {}, (\"mlx_lm.load\", {\"return_value\": (MagicMock(), MagicMock())}), True),\n        (OpenAIModel, {}, (\"openai.OpenAI\", {}), False),\n        (OpenAIModel, {\"flatten_messages_as_text\": True}, (\"openai.OpenAI\", {}), True),\n        (\n            TransformersModel,\n            {},\n            [\n                (\n                    \"transformers.AutoModelForImageTextToText.from_pretrained\",\n                    {\"side_effect\": ValueError(\"Unrecognized configuration class\")},\n                ),\n                (\"transformers.AutoModelForCausalLM.from_pretrained\", {}),\n                (\"transformers.AutoTokenizer.from_pretrained\", {}),\n            ],\n            True,\n        ),\n        (\n            TransformersModel,\n            {},\n            [\n                (\"transformers.AutoModelForImageTextToText.from_pretrained\", {}),\n                (\"transformers.AutoProcessor.from_pretrained\", {}),\n            ],\n            False,\n        ),\n    ],\n)\ndef test_flatten_messages_as_text_for_all_models(\n    model_class, model_kwargs, patching, expected_flatten_messages_as_text\n):\n    with ExitStack() as stack:\n        if isinstance(patching, list):\n            for target, kwargs in patching:\n                stack.enter_context(patch(target, **kwargs))\n        elif patching:\n            target, kwargs = patching\n            stack.enter_context(patch(target, **kwargs))\n\n        model = 
model_class(**{\"model_id\": \"test-model\", **model_kwargs})\n    assert model.flatten_messages_as_text is expected_flatten_messages_as_text, f\"{model_class.__name__} failed\"\n\n\n@pytest.mark.parametrize(\n    \"model_id,expected\",\n    [\n        # Unsupported base models\n        (\"o3\", False),\n        (\"o4-mini\", False),\n        (\"gpt-5.1\", False),\n        (\"gpt-5.2\", False),\n        (\"gpt-5\", False),\n        (\"gpt-5-mini\", False),\n        (\"gpt-5-nano\", False),\n        (\"gpt-5-turbo\", False),\n        (\"gpt-5.2-mini\", False),\n        (\"grok-4\", False),\n        (\"grok-4-latest\", False),\n        (\"grok-4.1\", False),\n        (\"grok-3\", False),\n        (\"grok-3-mini\", False),\n        (\"grok-code-fast-1\", False),\n        # Unsupported versioned models\n        (\"o4-mini-2025-04-16\", False),\n        (\"gpt-5-2025-01-01\", False),\n        # Unsupported models with path prefixes\n        (\"openai/o3\", False),\n        (\"openai/o4-mini\", False),\n        (\"openai/o3-2025-04-16\", False),\n        (\"openai/o4-mini-2025-04-16\", False),\n        (\"openai/gpt-5.2\", False),\n        (\"openai/gpt-5.2-mini\", False),\n        (\"openai/gpt-5.2-2025-01-01\", False),\n        (\"oci/xai.grok-4\", False),\n        (\"oci/xai.grok-3-mini\", False),\n        # Supported models\n        (\"o3-mini\", True),\n        (\"gpt-4\", True),\n        (\"claude-3-5-sonnet\", True),\n        (\"mistral-large\", True),\n        # Supported models with path prefixes\n        (\"openai/gpt-4\", True),\n        (\"anthropic/claude-3-5-sonnet\", True),\n        (\"anthropic/claude-opus-4-5\", True),\n        (\"mistralai/mistral-large\", True),\n        # Edge cases\n        (\"\", True),  # Empty string doesn't match pattern\n        (\"o3x\", True),  # Not exactly o3\n        (\"o4x\", True),  # Not exactly o4\n        (\"gpt-5x\", False),\n        (\"gpt-50\", False),\n        (\"o3_mini\", True),  # Not o3-mini format\n        
(\"prefix-o3\", True),  # o3 not at start\n    ],\n)\ndef test_supports_stop_parameter(model_id, expected):\n    \"\"\"Test the supports_stop_parameter function with various model IDs\"\"\"\n    assert supports_stop_parameter(model_id) == expected, f\"Failed for model_id: {model_id}\"\n\n\nclass TestGetToolCallFromText:\n    @pytest.fixture(autouse=True)\n    def mock_uuid4(self):\n        with patch(\"uuid.uuid4\", return_value=\"test-uuid\"):\n            yield\n\n    def test_get_tool_call_from_text_basic(self):\n        text = '{\"name\": \"weather_tool\", \"arguments\": \"New York\"}'\n        result = get_tool_call_from_text(text, \"name\", \"arguments\")\n        assert isinstance(result, ChatMessageToolCall)\n        assert result.id == \"test-uuid\"\n        assert result.type == \"function\"\n        assert result.function.name == \"weather_tool\"\n        assert result.function.arguments == \"New York\"\n\n    def test_get_tool_call_from_text_name_key_missing(self):\n        text = '{\"action\": \"weather_tool\", \"arguments\": \"New York\"}'\n        with pytest.raises(ValueError) as exc_info:\n            get_tool_call_from_text(text, \"name\", \"arguments\")\n        error_msg = str(exc_info.value)\n        assert \"Tool call needs to have a key 'name'\" in error_msg\n        assert \"'action', 'arguments'\" in error_msg\n\n    def test_get_tool_call_from_text_json_object_args(self):\n        text = '{\"name\": \"weather_tool\", \"arguments\": {\"city\": \"New York\"}}'\n        result = get_tool_call_from_text(text, \"name\", \"arguments\")\n        assert result.function.arguments == {\"city\": \"New York\"}\n\n    def test_get_tool_call_from_text_json_string_args(self):\n        text = '{\"name\": \"weather_tool\", \"arguments\": \"{\\\\\"city\\\\\": \\\\\"New York\\\\\"}\"}'\n        result = get_tool_call_from_text(text, \"name\", \"arguments\")\n        assert result.function.arguments == {\"city\": \"New York\"}\n\n    def 
test_get_tool_call_from_text_missing_args(self):\n        text = '{\"name\": \"weather_tool\"}'\n        result = get_tool_call_from_text(text, \"name\", \"arguments\")\n        assert result.function.arguments is None\n\n    def test_get_tool_call_from_text_custom_keys(self):\n        text = '{\"tool\": \"weather_tool\", \"params\": \"New York\"}'\n        result = get_tool_call_from_text(text, \"tool\", \"params\")\n        assert result.function.name == \"weather_tool\"\n        assert result.function.arguments == \"New York\"\n\n    def test_get_tool_call_from_text_numeric_args(self):\n        text = '{\"name\": \"calculator\", \"arguments\": 42}'\n        result = get_tool_call_from_text(text, \"name\", \"arguments\")\n        assert result.function.name == \"calculator\"\n        assert result.function.arguments == 42\n\n\n@pytest.mark.parametrize(\n    \"model_class,model_id\",\n    [\n        (LiteLLMModel, \"gpt-4o-mini\"),\n        (OpenAIModel, \"gpt-4o-mini\"),\n    ],\n)\ndef test_tool_calls_json_serialization(model_class, model_id):\n    \"\"\"Test that tool_calls from various API models (Pydantic, dataclass, dict) are properly converted to dataclasses and can be JSON serialized.\n    This tests the horizontal fix that ensures all models (LiteLLM, OpenAI, InferenceClient, AmazonBedrock)\n    properly convert tool_calls to dataclasses regardless of the source format (Pydantic models, dataclasses, or dicts).\n    \"\"\"\n    tool_arguments = \"test_result\"\n    messages = [\n        ChatMessage(\n            role=MessageRole.USER,\n            content=[\n                {\n                    \"type\": \"text\",\n                    \"text\": \"Hello! 
Please return the final answer 'hi there' in a tool call\",\n                }\n            ],\n        ),\n    ]\n\n    if model_class == OpenAIModel:\n        from openai.types.chat.chat_completion import ChatCompletion, Choice\n        from openai.types.chat.chat_completion_message import ChatCompletionMessage\n        from openai.types.chat.chat_completion_message_tool_call import ChatCompletionMessageToolCall, Function\n        from openai.types.completion_usage import CompletionUsage\n\n        response = ChatCompletion(\n            id=\"chatcmpl-test\",\n            created=0,\n            model=\"gpt-4o-mini-2024-07-18\",\n            object=\"chat.completion\",\n            choices=[\n                Choice(\n                    finish_reason=\"tool_calls\",\n                    index=0,\n                    logprobs=None,\n                    message=ChatCompletionMessage(\n                        role=\"assistant\",\n                        content=None,\n                        tool_calls=[\n                            ChatCompletionMessageToolCall(\n                                id=\"call_test\",\n                                type=\"function\",\n                                function=Function(name=\"final_answer\", arguments=tool_arguments),\n                            )\n                        ],\n                    ),\n                )\n            ],\n            usage=CompletionUsage(prompt_tokens=69, completion_tokens=15, total_tokens=84),\n        )\n        client = MagicMock()\n        client.chat.completions.create.return_value = response\n        create_call = client.chat.completions.create\n        patch_target = \"smolagents.models.OpenAIModel.create_client\"\n    elif model_class == LiteLLMModel:\n        from litellm.types.utils import ChatCompletionMessageToolCall, Choices, Function, Message, ModelResponse, Usage\n\n        response = ModelResponse(\n            id=\"chatcmpl-test\",\n            created=0,\n            
object=\"chat.completion\",\n            choices=[\n                Choices(\n                    finish_reason=\"tool_calls\",\n                    index=0,\n                    message=Message(\n                        role=\"assistant\",\n                        content=None,\n                        tool_calls=[\n                            ChatCompletionMessageToolCall(\n                                id=\"call_test\",\n                                type=\"function\",\n                                function=Function(name=\"final_answer\", arguments=tool_arguments),\n                            )\n                        ],\n                        function_call=None,\n                        provider_specific_fields={\"refusal\": None, \"annotations\": []},\n                    ),\n                )\n            ],\n            usage=Usage(prompt_tokens=69, completion_tokens=15, total_tokens=84),\n            model=\"gpt-4o-mini-2024-07-18\",\n        )\n        client = MagicMock()\n        client.completion.return_value = response\n        create_call = client.completion\n        patch_target = \"smolagents.models.LiteLLMModel.create_client\"\n    else:\n        raise ValueError(f\"Unexpected model class: {model_class}\")\n\n    with patch(patch_target, return_value=client):\n        model = model_class(model_id=model_id)\n        result = model.generate(messages, tools_to_call_from=[FinalAnswerTool()])\n\n    assert create_call.call_count == 1\n\n    # Verify tool_calls are converted to dataclasses\n    assert result.tool_calls is not None\n    assert len(result.tool_calls) > 0\n    assert isinstance(result.tool_calls[0], ChatMessageToolCall)\n\n    # The critical test: verify JSON serialization works\n    json_str = result.model_dump_json()\n    data = json.loads(json_str)\n    assert \"tool_calls\" in data\n    assert len(data[\"tool_calls\"]) > 0\n    assert data[\"tool_calls\"][0][\"function\"][\"name\"] == \"final_answer\"\n    assert 
data[\"tool_calls\"][0][\"function\"][\"arguments\"] == \"test_result\"\n"
  },
  {
    "path": "tests/test_monitoring.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\nimport unittest\n\nimport PIL.Image\nimport pytest\nfrom rich.console import Console\n\nfrom smolagents import (\n    CodeAgent,\n    ToolCallingAgent,\n    stream_to_gradio,\n)\nfrom smolagents.memory import ActionStep, AgentMemory\nfrom smolagents.models import (\n    ChatMessage,\n    ChatMessageToolCall,\n    ChatMessageToolCallFunction,\n    MessageRole,\n    Model,\n)\nfrom smolagents.monitoring import AgentLogger, TokenUsage\n\n\nclass FakeLLMModel(Model):\n    def generate(self, prompt, tools_to_call_from=None, **kwargs):\n        if tools_to_call_from is not None:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"I will call the final_answer tool.\",\n                tool_calls=[\n                    ChatMessageToolCall(\n                        id=\"fake_id\",\n                        type=\"function\",\n                        function=ChatMessageToolCallFunction(\n                            name=\"final_answer\", arguments={\"answer\": \"This is the final answer.\"}\n                        ),\n                    )\n                ],\n                token_usage=TokenUsage(input_tokens=10, output_tokens=20),\n            )\n        else:\n            return ChatMessage(\n                role=MessageRole.ASSISTANT,\n                content=\"\"\"<code>\nfinal_answer('This is the 
final answer.')\n</code>\"\"\",\n                token_usage=TokenUsage(input_tokens=10, output_tokens=20),\n            )\n\n\nclass MonitoringTester(unittest.TestCase):\n    def test_code_agent_metrics_max_steps(self):\n        class FakeLLMModelMalformedAnswer(Model):\n            def generate(self, prompt, **kwargs):\n                return ChatMessage(\n                    role=MessageRole.ASSISTANT,\n                    content=\"Malformed answer\",\n                    token_usage=TokenUsage(input_tokens=10, output_tokens=20),\n                )\n\n        agent = CodeAgent(\n            tools=[],\n            model=FakeLLMModelMalformedAnswer(),\n            max_steps=1,\n        )\n\n        agent.run(\"Fake task\")\n\n        self.assertEqual(agent.monitor.total_input_token_count, 20)\n        self.assertEqual(agent.monitor.total_output_token_count, 40)\n\n    def test_code_agent_metrics_generation_error(self):\n        class FakeLLMModelGenerationException(Model):\n            def generate(self, prompt, **kwargs):\n                raise Exception(\"Cannot generate\")\n\n        agent = CodeAgent(\n            tools=[],\n            model=FakeLLMModelGenerationException(),\n            max_steps=1,\n        )\n        with pytest.raises(Exception) as e:\n            agent.run(\"Fake task\")\n        assert \"Cannot generate\" in str(e.value)\n\n    def test_streaming_agent_text_output(self):\n        agent = CodeAgent(\n            tools=[],\n            model=FakeLLMModel(),\n            max_steps=1,\n            planning_interval=2,\n        )\n\n        # Use stream_to_gradio to capture the output\n        outputs = list(stream_to_gradio(agent, task=\"Test task\"))\n\n        self.assertEqual(len(outputs), 11)\n        plan_message = outputs[1]\n        self.assertEqual(plan_message.role, \"assistant\")\n        self.assertIn(\"<code>\", plan_message.content)\n        final_message = outputs[-1]\n        self.assertEqual(final_message.role, 
\"assistant\")\n        self.assertIn(\"This is the final answer.\", final_message.content)\n\n    def test_streaming_agent_image_output(self):\n        class FakeLLMModelImage(Model):\n            def generate(self, prompt, **kwargs):\n                return ChatMessage(\n                    role=MessageRole.ASSISTANT,\n                    content=\"I will call the final_answer tool.\",\n                    tool_calls=[\n                        ChatMessageToolCall(\n                            id=\"fake_id\",\n                            type=\"function\",\n                            function=ChatMessageToolCallFunction(name=\"final_answer\", arguments={\"answer\": \"image\"}),\n                        )\n                    ],\n                )\n\n        agent = ToolCallingAgent(\n            tools=[],\n            model=FakeLLMModelImage(),\n            max_steps=1,\n            verbosity_level=100,\n        )\n\n        # Use stream_to_gradio to capture the output\n        outputs = list(\n            stream_to_gradio(\n                agent,\n                task=\"Test task\",\n                additional_args=dict(image=PIL.Image.new(\"RGB\", (100, 100))),\n            )\n        )\n\n        self.assertEqual(len(outputs), 7)\n        final_message = outputs[-1]\n        self.assertEqual(final_message.role, \"assistant\")\n        self.assertIsInstance(final_message.content, dict)\n        self.assertEqual(final_message.content[\"mime_type\"], \"image/png\")\n\n    def test_streaming_with_agent_error(self):\n        class DummyModel(Model):\n            def generate(self, prompt, **kwargs):\n                return ChatMessage(role=MessageRole.ASSISTANT, content=\"Malformed call\")\n\n        agent = CodeAgent(\n            tools=[],\n            model=DummyModel(),\n            max_steps=1,\n        )\n\n        # Use stream_to_gradio to capture the output\n        outputs = list(stream_to_gradio(agent, task=\"Test task\"))\n\n        
self.assertEqual(len(outputs), 11)\n        final_message = outputs[-1]\n        self.assertEqual(final_message.role, \"assistant\")\n        self.assertIn(\"Malformed call\", final_message.content)\n\n\n@pytest.mark.parametrize(\"agent_class\", [CodeAgent, ToolCallingAgent])\ndef test_code_agent_metrics(agent_class):\n    agent = agent_class(\n        tools=[],\n        model=FakeLLMModel(),\n        max_steps=1,\n    )\n    agent.run(\"Fake task\")\n\n    assert agent.monitor.total_input_token_count == 10\n    assert agent.monitor.total_output_token_count == 20\n\n\nclass ReplayTester(unittest.TestCase):\n    def test_replay_with_chatmessage(self):\n        \"\"\"Regression test for dict(message) to message.dict() fix\"\"\"\n        logger = AgentLogger()\n        memory = AgentMemory(system_prompt=\"test\")\n        step = ActionStep(step_number=1, timing=0)\n        step.model_input_messages = [ChatMessage(role=MessageRole.USER, content=\"Hello\")]\n        memory.steps.append(step)\n\n        try:\n            memory.replay(logger, detailed=True)\n        except TypeError as e:\n            self.fail(f\"Replay raised an error: {e}\")\n\n\nclass AgentLoggerLogTaskTester(unittest.TestCase):\n    def test_logger_log_task_does_not_crash_on_stray_markup_or_control_chars(self):\n        \"\"\"\n        Rich Panels parse `title`/`subtitle` as markup when passed as strings.\n        `AgentLogger.log_task()` must be resilient to arbitrary content/subtitle strings\n        (e.g. 
tool logs, binary-ish payloads, or stray bracket sequences).\n        \"\"\"\n        console = Console(record=True, width=120, highlight=False)\n        logger = AgentLogger(console=console)\n\n        # These inputs would crash Rich markup parsing if passed through as markup strings.\n        content = b\"hello [/bad]\\x00\\x1b world [bold]bold[/bold]\"\n        subtitle = \"sub[/bad]title\"\n\n        logger.log_task(content=content, subtitle=subtitle, title=None)\n\n        rendered = console.export_text()\n        self.assertIn(\"hello [/bad]\", rendered)\n        # Control chars are made visible as escape sequences.\n        self.assertIn(\"\\\\x00\", rendered)\n        self.assertIn(\"\\\\x1b\", rendered)\n        self.assertIn(\"sub[/bad]title\", rendered)\n        self.assertIn(\"bold\", rendered)\n\n    def test_logger_log_task_accepts_non_string_payloads(self):\n        console = Console(record=True, width=120, highlight=False)\n        logger = AgentLogger(console=console)\n\n        logger.log_task(content={\"k\": [\"v\", 1]}, subtitle={\"also\": \"dict\"}, title=\"Run\")\n        rendered = console.export_text()\n        self.assertIn(\"k\", rendered)\n        self.assertIn(\"also\", rendered)\n"
  },
  {
    "path": "tests/test_remote_executors.py",
    "content": "import importlib\nimport io\nfrom textwrap import dedent\nfrom unittest.mock import MagicMock, patch\n\nimport docker\nimport PIL.Image\nimport pytest\nfrom rich.console import Console\n\nfrom smolagents.default_tools import FinalAnswerTool, WikipediaSearchTool\nfrom smolagents.local_python_executor import CodeOutput\nfrom smolagents.monitoring import AgentLogger, LogLevel\nfrom smolagents.remote_executors import (\n    BlaxelExecutor,\n    DockerExecutor,\n    E2BExecutor,\n    ModalExecutor,\n    RemotePythonExecutor,\n    WasmExecutor,\n)\nfrom smolagents.serialization import SerializationError\nfrom smolagents.utils import AgentError\n\nfrom .utils.markers import require_run_all\n\n\nclass TestRemotePythonExecutor:\n    def test_send_tools_empty_tools(self):\n        executor = RemotePythonExecutor(additional_imports=[], logger=MagicMock())\n        executor.run_code_raise_errors = MagicMock()\n        executor.send_tools({})\n        assert executor.run_code_raise_errors.call_count == 1\n        # No new packages should be installed\n        assert \"!pip install\" not in executor.run_code_raise_errors.call_args.args[0]\n\n    def test_send_variables_with_empty_dict_is_noop(self):\n        executor = RemotePythonExecutor(additional_imports=[], logger=MagicMock())\n        executor.run_code_raise_errors = MagicMock()\n        executor.send_variables({})\n        assert executor.run_code_raise_errors.call_count == 0\n\n    def test_send_variables_non_empty_generates_executable_deserializer_code(self):\n        executor = RemotePythonExecutor(additional_imports=[], logger=MagicMock(), allow_pickle=False)\n        executor.run_code_raise_errors = MagicMock()\n\n        variables = {\n            \"counter\": 1,\n            \"tags\": (\"a\", \"b\"),\n            \"blob\": b\"binary\",\n        }\n        executor.send_variables(variables)\n\n        sent_code = executor.run_code_raise_errors.call_args.args[0]\n        remote_scope = {}\n        
exec(sent_code, remote_scope, remote_scope)\n\n        assert remote_scope[\"counter\"] == 1\n        assert remote_scope[\"tags\"] == (\"a\", \"b\")\n        assert remote_scope[\"blob\"] == b\"binary\"\n\n    def test_send_variables_allow_pickle_handles_prefixed_payload(self):\n        executor = RemotePythonExecutor(additional_imports=[], logger=MagicMock(), allow_pickle=True)\n        executor.run_code_raise_errors = MagicMock()\n\n        variables = {\"error\": ValueError(\"boom\")}\n        executor.send_variables(variables)\n\n        sent_code = executor.run_code_raise_errors.call_args.args[0]\n        remote_scope = {}\n        exec(sent_code, remote_scope, remote_scope)\n\n        assert isinstance(remote_scope[\"error\"], ValueError)\n        assert str(remote_scope[\"error\"]) == \"boom\"\n\n    def test_deserialize_final_answer_rejects_unprefixed_payload(self):\n        with pytest.raises(SerializationError, match=\"Unknown final answer format\"):\n            RemotePythonExecutor._deserialize_final_answer(\"legacy-unprefixed-payload\", allow_pickle=True)\n\n    @require_run_all\n    def test_send_tools_with_default_wikipedia_search_tool(self):\n        tool = WikipediaSearchTool()\n        executor = RemotePythonExecutor(additional_imports=[], logger=MagicMock())\n        executor.run_code_raise_errors = MagicMock()\n        executor.send_tools({\"wikipedia_search\": tool})\n        assert executor.run_code_raise_errors.call_count == 2\n        assert \"!pip install wikipedia-api\" == executor.run_code_raise_errors.call_args_list[0].args[0]\n        assert \"class WikipediaSearchTool(Tool)\" in executor.run_code_raise_errors.call_args_list[1].args[0]\n\n\nclass TestE2BExecutorUnit:\n    def test_e2b_executor_instantiation(self):\n        logger = MagicMock()\n        with patch(\"e2b_code_interpreter.Sandbox\") as mock_sandbox:\n            mock_sandbox.return_value.commands.run.return_value.error = None\n            
mock_sandbox.return_value.run_code.return_value.error = None\n            # Also set up v2 path in case Sandbox.create is used\n            mock_sandbox.create.return_value.commands.run.return_value.error = None\n            mock_sandbox.create.return_value.run_code.return_value.error = None\n            executor = E2BExecutor(\n                additional_imports=[], logger=logger, api_key=\"dummy-api-key\", template=\"dummy-template-id\", timeout=60\n            )\n        assert isinstance(executor, E2BExecutor)\n        assert executor.logger == logger\n        # Support both e2b v1 (Sandbox(...)) and v2 (Sandbox.create(...))\n        if mock_sandbox.create.called:\n            sandbox_obj = mock_sandbox.create.return_value\n            called_ctor = mock_sandbox.create\n        else:\n            sandbox_obj = mock_sandbox.return_value\n            called_ctor = mock_sandbox\n        assert executor.sandbox == sandbox_obj\n        assert called_ctor.call_count == 1\n        assert called_ctor.call_args.kwargs == {\n            \"api_key\": \"dummy-api-key\",\n            \"template\": \"dummy-template-id\",\n            \"timeout\": 60,\n        }\n\n    def test_cleanup(self):\n        \"\"\"Test that the cleanup method properly shuts down the sandbox\"\"\"\n        logger = MagicMock()\n        with patch(\"e2b_code_interpreter.Sandbox\") as mock_sandbox:\n            # Setup mock\n            mock_sandbox.return_value.kill = MagicMock()\n            # Also set up v2 path in case Sandbox.create is used\n            mock_sandbox.create.return_value.kill = MagicMock()\n\n            # Create executor\n            executor = E2BExecutor(additional_imports=[], logger=logger, api_key=\"dummy-api-key\")\n\n            # Call cleanup\n            executor.cleanup()\n\n            # Verify sandbox was killed\n            if mock_sandbox.create.called:\n                mock_sandbox.create.return_value.kill.assert_called_once()\n            else:\n                
mock_sandbox.return_value.kill.assert_called_once()\n            assert logger.log.call_count >= 2  # Should log start and completion messages\n\n\n@pytest.fixture\ndef e2b_executor():\n    executor = E2BExecutor(\n        additional_imports=[\"pillow\", \"numpy\"],\n        logger=AgentLogger(LogLevel.INFO, Console(force_terminal=False, file=io.StringIO())),\n    )\n    yield executor\n    executor.cleanup()\n\n\n@require_run_all\nclass TestE2BExecutorIntegration:\n    @pytest.fixture(autouse=True)\n    def set_executor(self, e2b_executor):\n        self.executor = e2b_executor\n\n    @pytest.mark.parametrize(\n        \"code_action, expected_result\",\n        [\n            (\n                dedent('''\n                    final_answer(\"\"\"This is\n                    a multiline\n                    final answer\"\"\")\n                '''),\n                \"This is\\na multiline\\nfinal answer\",\n            ),\n            (\n                dedent(\"\"\"\n                    text = '''Text containing\n                    final_answer(5)\n                    '''\n                    final_answer(text)\n                \"\"\"),\n                \"Text containing\\nfinal_answer(5)\\n\",\n            ),\n            (\n                dedent(\"\"\"\n                    num = 2\n                    if num == 1:\n                        final_answer(\"One\")\n                    elif num == 2:\n                        final_answer(\"Two\")\n                \"\"\"),\n                \"Two\",\n            ),\n        ],\n    )\n    def test_final_answer_patterns(self, code_action, expected_result):\n        self.executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        code_output = self.executor(code_action)\n        assert code_output.is_final_answer is True\n        assert code_output.output == expected_result\n\n    def test_custom_final_answer(self):\n        class CustomFinalAnswerTool(FinalAnswerTool):\n            def forward(self, answer: 
str) -> str:\n                return \"CUSTOM\" + answer\n\n        self.executor.send_tools({\"final_answer\": CustomFinalAnswerTool()})\n        code_action = dedent(\"\"\"\n            final_answer(answer=\"_answer\")\n        \"\"\")\n        code_output = self.executor(code_action)\n        assert code_output.is_final_answer is True\n        assert code_output.output == \"CUSTOM_answer\"\n\n    def test_custom_final_answer_with_custom_inputs(self):\n        class CustomFinalAnswerToolWithCustomInputs(FinalAnswerTool):\n            inputs = {\n                \"answer1\": {\"type\": \"string\", \"description\": \"First part of the answer.\"},\n                \"answer2\": {\"type\": \"string\", \"description\": \"Second part of the answer.\"},\n            }\n\n            def forward(self, answer1: str, answer2: str) -> str:\n                return answer1 + \"CUSTOM\" + answer2\n\n        self.executor.send_tools({\"final_answer\": CustomFinalAnswerToolWithCustomInputs()})\n        code_action = dedent(\"\"\"\n            final_answer(\n                answer1=\"answer1_\",\n                answer2=\"_answer2\"\n            )\n        \"\"\")\n        code_output = self.executor(code_action)\n        assert code_output.is_final_answer is True\n        assert code_output.output == \"answer1_CUSTOM_answer2\"\n\n\nclass TestDockerExecutorUnit:\n    def test_cleanup(self):\n        \"\"\"Test that cleanup properly stops and removes the container\"\"\"\n        logger = MagicMock()\n        with (\n            patch(\"docker.from_env\") as mock_docker_client,\n            patch(\"requests.get\") as mock_get,\n            patch(\"requests.post\") as mock_post,\n            patch(\"websocket.create_connection\"),\n        ):\n            # Setup mocks\n            mock_container = MagicMock()\n            mock_container.status = \"running\"\n            mock_container.short_id = \"test123\"\n\n            mock_docker_client.return_value.containers.run.return_value = 
mock_container\n            mock_docker_client.return_value.images.get.return_value = MagicMock()\n\n            mock_get.return_value.status_code = 200\n            mock_post.return_value.status_code = 201\n            mock_post.return_value.json.return_value = {\"id\": \"test-kernel-id\"}\n\n            # Create executor\n            executor = DockerExecutor(additional_imports=[], logger=logger, build_new_image=False)\n\n            # Call cleanup\n            executor.cleanup()\n\n            # Verify container was stopped and removed\n            mock_container.stop.assert_called_once()\n            mock_container.remove.assert_called_once()\n\n\nclass CommonDockerExecutorIntegration:\n    @pytest.fixture(autouse=True)\n    def set_executor(self, custom_executor):\n        self.executor = custom_executor\n\n    def test_state_persistence(self):\n        \"\"\"Test that variables and imports from one snippet persist in the next\"\"\"\n        code_action = \"import numpy as np; a = 2\"\n        self.executor(code_action)\n\n        code_action = \"print(np.sqrt(a))\"\n        code_output = self.executor(code_action)\n        assert \"1.41421\" in code_output.logs\n\n    def test_execute_output(self):\n        \"\"\"Test execution that returns a string\"\"\"\n        self.executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        code_action = 'final_answer(\"This is the final answer\")'\n        code_output = self.executor(code_action)\n        assert code_output.output == \"This is the final answer\", \"Result should be 'This is the final answer'\"\n\n    def test_execute_multiline_output(self):\n        \"\"\"Test execution that returns a string\"\"\"\n        self.executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        code_action = 'result = \"This is the final answer\"\\nfinal_answer(result)'\n        code_output = self.executor(code_action)\n        assert code_output.output == \"This is the final answer\", \"Result should be 'This is 
the final answer'\"\n\n    def test_execute_image_output(self):\n        \"\"\"Test execution that returns a base64 image\"\"\"\n        self.executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        code_action = dedent(\"\"\"\n            import base64\n            from PIL import Image\n            from io import BytesIO\n            image = Image.new(\"RGB\", (10, 10), (255, 0, 0))\n            final_answer(image)\n        \"\"\")\n        code_output = self.executor(code_action)\n        assert isinstance(code_output.output, PIL.Image.Image), \"Result should be a PIL Image\"\n\n    def test_syntax_error_handling(self):\n        \"\"\"Test handling of syntax errors\"\"\"\n        code_action = 'print(\"Missing Parenthesis'  # Syntax error\n        with pytest.raises(AgentError) as exception_info:\n            self.executor(code_action)\n        assert \"SyntaxError\" in str(exception_info.value), \"Should raise a syntax error\"\n\n    @pytest.mark.parametrize(\n        \"code_action, expected_result\",\n        [\n            (\n                dedent('''\n                    final_answer(\"\"\"This is\n                    a multiline\n                    final answer\"\"\")\n                '''),\n                \"This is\\na multiline\\nfinal answer\",\n            ),\n            (\n                dedent(\"\"\"\n                    text = '''Text containing\n                    final_answer(5)\n                    '''\n                    final_answer(text)\n                \"\"\"),\n                \"Text containing\\nfinal_answer(5)\\n\",\n            ),\n            (\n                dedent(\"\"\"\n                    num = 2\n                    if num == 1:\n                        final_answer(\"One\")\n                    elif num == 2:\n                        final_answer(\"Two\")\n                \"\"\"),\n                \"Two\",\n            ),\n        ],\n    )\n    def test_final_answer_patterns(self, code_action, 
expected_result):\n        self.executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        code_output = self.executor(code_action)\n        assert code_output.is_final_answer is True\n        assert code_output.output == expected_result\n\n    def test_custom_final_answer(self):\n        class CustomFinalAnswerTool(FinalAnswerTool):\n            def forward(self, answer: str) -> str:\n                return \"CUSTOM\" + answer\n\n        self.executor.send_tools({\"final_answer\": CustomFinalAnswerTool()})\n        code_action = dedent(\"\"\"\n            final_answer(answer=\"_answer\")\n        \"\"\")\n        code_output = self.executor(code_action)\n        assert code_output.is_final_answer is True\n        assert code_output.output == \"CUSTOM_answer\"\n\n    def test_custom_final_answer_with_custom_inputs(self):\n        class CustomFinalAnswerToolWithCustomInputs(FinalAnswerTool):\n            inputs = {\n                \"answer1\": {\"type\": \"string\", \"description\": \"First part of the answer.\"},\n                \"answer2\": {\"type\": \"string\", \"description\": \"Second part of the answer.\"},\n            }\n\n            def forward(self, answer1: str, answer2: str) -> str:\n                return answer1 + \"CUSTOM\" + answer2\n\n        self.executor.send_tools({\"final_answer\": CustomFinalAnswerToolWithCustomInputs()})\n        code_action = dedent(\"\"\"\n            final_answer(\n                answer1=\"answer1_\",\n                answer2=\"_answer2\"\n            )\n        \"\"\")\n        code_output = self.executor(code_action)\n        assert code_output.is_final_answer is True\n        assert code_output.output == \"answer1_CUSTOM_answer2\"\n\n\n@require_run_all\nclass TestDockerExecutorIntegration(CommonDockerExecutorIntegration):\n    @pytest.fixture\n    def custom_executor(self):\n        executor = DockerExecutor(\n            additional_imports=[\"pillow\", \"numpy\"],\n            
logger=AgentLogger(LogLevel.INFO, Console(force_terminal=False, file=io.StringIO())),\n        )\n        yield executor\n        executor.delete()\n\n    def test_initialization(self):\n        \"\"\"Check if DockerExecutor initializes without errors\"\"\"\n        assert self.executor.container is not None, \"Container should be initialized\"\n\n    def test_cleanup_on_deletion(self):\n        \"\"\"Test if Docker container stops and removes on deletion\"\"\"\n        container_id = self.executor.container.id\n        self.executor.delete()  # Trigger cleanup\n\n        client = docker.from_env()\n        containers = [c.id for c in client.containers.list(all=True)]\n        assert container_id not in containers, \"Container should be removed\"\n\n\n@require_run_all\nclass TestModalExecutorIntegration(CommonDockerExecutorIntegration):\n    @pytest.fixture\n    def custom_executor(self):\n        executor = ModalExecutor(\n            additional_imports=[\"pillow\", \"numpy\"],\n            logger=AgentLogger(LogLevel.INFO, Console(force_terminal=False, file=io.StringIO())),\n        )\n        yield executor\n        executor.delete()\n\n\nclass TestModalExecutorUnit:\n    @patch(\"smolagents.remote_executors._websocket_run_code_raise_errors\")\n    @patch(\"requests.post\")\n    @patch(\"requests.get\")\n    @patch(\"websocket.create_connection\")\n    @patch(\"modal.App.lookup\")\n    @patch(\"modal.Sandbox.create\")\n    def test_sandbox_lifecycle(\n        self, mock_sandbox_create, mock_app_lookup, mock_create_connection, mock_get, mock_post, mock_run_code_raises\n    ):\n        \"\"\"Test that sandbox is created with the correct kwargs and cleaned up correctly.\"\"\"\n        modal = pytest.importorskip(\"modal\")\n        port = 8889\n\n        logger = MagicMock()\n        mock_sandbox = MagicMock()\n        tunnel_mock = MagicMock()\n        tunnel_mock.host = \"r4234.modal.host\"\n        mock_sandbox.tunnels.return_value = {port: tunnel_mock}\n\n      
  mock_get.return_value.status_code = 200\n        mock_post.return_value.status_code = 201\n        mock_post.return_value.json.return_value = {\"id\": \"test-kernel-id\"}\n        mock_run_code_raises.return_value = CodeOutput(output=\"3\", logs=\"\", is_final_answer=False)\n        mock_sandbox_create.return_value = mock_sandbox\n\n        executor = ModalExecutor(\n            additional_imports=[],\n            logger=logger,\n            app_name=\"my-custom-app-name\",\n            port=port,\n            create_kwargs={\n                \"secrets\": [modal.Secret.from_dict({\"MY_SECRET\": \"ABC\"})],\n                \"timeout\": 100,\n                \"cpu\": 2,\n            },\n        )\n\n        create_call = mock_sandbox_create.mock_calls[0]\n        assert create_call.args == (\n            \"jupyter\",\n            \"kernelgateway\",\n            \"--KernelGatewayApp.ip=0.0.0.0\",\n            f\"--KernelGatewayApp.port={port}\",\n        )\n        assert create_call.kwargs[\"timeout\"] == 100\n        assert create_call.kwargs[\"cpu\"] == 2\n        assert len(create_call.kwargs[\"secrets\"]) == 2\n        mock_app_lookup.assert_called_with(\"my-custom-app-name\", create_if_missing=True)\n\n        executor.run_code_raise_errors(\"1 + 2\")\n        executor.cleanup()\n        mock_sandbox.terminate.assert_called()\n\n\nclass TestWasmExecutorUnit:\n    def test_wasm_executor_instantiation(self):\n        logger = MagicMock()\n\n        # Mock subprocess.run to simulate Deno being installed\n        with (\n            patch(\"subprocess.run\") as mock_run,\n            patch(\"subprocess.Popen\") as mock_popen,\n            patch(\"smolagents.remote_executors.requests.Session\") as mock_session_cls,\n            patch(\"time.sleep\"),\n        ):\n            # Configure mocks\n            mock_run.return_value.returncode = 0\n            mock_process = MagicMock()\n            mock_process.poll.return_value = None\n            
mock_popen.return_value = mock_process\n            mock_session = MagicMock()\n            mock_session.get.return_value.status_code = 200\n            mock_session_cls.return_value = mock_session\n\n            # Create the executor\n            executor = WasmExecutor(additional_imports=[\"numpy\", \"pandas\"], logger=logger, timeout=30)\n\n            # Verify the executor was created correctly\n            assert isinstance(executor, WasmExecutor)\n            assert executor.logger == logger\n            assert executor.timeout == 30\n            assert \"numpy\" in executor.installed_packages\n            assert \"pandas\" in executor.installed_packages\n\n            # Verify Deno was checked\n            assert mock_run.call_count == 1\n            assert mock_run.call_args.args[0][0] == \"deno\"\n            assert mock_run.call_args.args[0][1] == \"--version\"\n\n            # Verify server was started\n            assert mock_popen.call_count == 1\n            assert mock_popen.call_args.args[0][0] == \"deno\"\n            assert mock_popen.call_args.args[0][1] == \"run\"\n            assert (\n                \"--allow-net=127.0.0.1:8000,cdn.jsdelivr.net:443,pypi.org:443,files.pythonhosted.org:443\"\n                in (mock_popen.call_args.args[0])\n            )\n\n            # Clean up\n            with patch(\"shutil.rmtree\"):\n                executor.cleanup()\n\n\n@require_run_all\nclass TestWasmExecutorIntegration:\n    \"\"\"\n    Integration tests for WasmExecutor.\n\n    These tests require Deno to be installed on the system.\n    Skip these tests if you don't have Deno installed.\n    \"\"\"\n\n    @pytest.fixture(autouse=True)\n    def setup_and_teardown(self):\n        \"\"\"Setup and teardown for each test.\"\"\"\n        try:\n            # Check if Deno is installed\n            import subprocess\n\n            subprocess.run([\"deno\", \"--version\"], capture_output=True, check=True)\n\n            # Create the executor\n            
self.executor = WasmExecutor(\n                additional_imports=[\"numpy\", \"pandas\"],\n                logger=AgentLogger(LogLevel.INFO, Console(force_terminal=False, file=io.StringIO())),\n                timeout=60,\n            )\n            yield\n            # Clean up\n            self.executor.cleanup()\n        except (subprocess.SubprocessError, FileNotFoundError):\n            pytest.skip(\"Deno is not installed, skipping integration tests\")\n\n    def test_basic_execution(self):\n        \"\"\"Test basic code execution.\"\"\"\n        code = \"a = 2 + 2; print(f'Result: {a}')\"\n        code_output = self.executor(code)\n        assert \"Result: 4\" in code_output.logs\n\n    def test_state_persistence(self):\n        \"\"\"Test that variables persist between executions.\"\"\"\n        # Define a variable\n        self.executor(\"x = 42\")\n\n        # Use the variable in a subsequent execution\n        code_output = self.executor(\"print(x)\")\n        assert \"42\" in code_output.logs\n\n    def test_final_answer(self):\n        \"\"\"Test returning a final answer.\"\"\"\n        self.executor.send_tools({\"final_answer\": FinalAnswerTool()})\n        code = 'final_answer(\"This is the final answer\")'\n        code_output = self.executor(code)\n        assert code_output.output == \"This is the final answer\"\n        assert code_output.is_final_answer is True\n\n    def test_numpy_execution(self):\n        \"\"\"Test execution with NumPy.\"\"\"\n        code = \"\"\"\n        import numpy as np\n        arr = np.array([1, 2, 3, 4, 5])\n        print(f\"Mean: {np.mean(arr)}\")\n        \"\"\"\n        code_output = self.executor(code)\n        assert \"Mean: 3.0\" in code_output.logs\n\n    def test_error_handling(self):\n        \"\"\"Test handling of Python errors.\"\"\"\n        code = \"1/0\"  # Division by zero\n        with pytest.raises(AgentError) as excinfo:\n            self.executor(code)\n        assert \"ZeroDivisionError\" in 
str(excinfo.value)\n\n    def test_syntax_error_handling(self):\n        \"\"\"Test handling of syntax errors.\"\"\"\n        code = \"print('Missing parenthesis\"  # Missing closing parenthesis\n        with pytest.raises(AgentError) as excinfo:\n            self.executor(code)\n        assert \"SyntaxError\" in str(excinfo.value)\n\n\nclass TestBlaxelExecutorUnit:\n    def test_blaxel_executor_instantiation_without_blaxel_sdk(self):\n        \"\"\"Test that BlaxelExecutor raises appropriate error when blaxel SDK is not installed.\"\"\"\n        logger = MagicMock()\n        with patch.dict(\"sys.modules\", {\"blaxel.core\": None}):\n            with pytest.raises(ModuleNotFoundError) as excinfo:\n                BlaxelExecutor(additional_imports=[], logger=logger)\n            assert \"Please install 'blaxel' extra\" in str(excinfo.value)\n\n    @patch(\"smolagents.remote_executors._create_kernel_http\")\n    @patch(\"blaxel.core.SandboxInstance\")\n    @patch(\"blaxel.core.settings\")\n    def test_blaxel_executor_instantiation_with_blaxel_sdk(\n        self, mock_settings, mock_sandbox_instance, mock_create_kernel\n    ):\n        \"\"\"Test BlaxelExecutor instantiation with mocked Blaxel SDK.\"\"\"\n\n        # patch manually for Python 3.10 compatibility\n        from unittest.mock import patch\n\n        mod = importlib.import_module(\"blaxel.core.client.api.compute\")\n        patcher = patch.object(mod, \"create_sandbox\")\n        mock_create_sandbox = patcher.start()\n\n        logger = MagicMock()\n        mock_settings.headers = {}\n\n        # Mock sandbox response\n        mock_response = MagicMock()\n        mock_create_sandbox.sync.return_value = mock_response\n\n        # Mock SandboxInstance\n        mock_sandbox = MagicMock()\n        mock_metadata = MagicMock()\n        mock_metadata.url = \"https://test-sandbox.bl.run\"\n        mock_sandbox.metadata = mock_metadata\n        mock_sandbox_instance.return_value = mock_sandbox\n\n        # Mock 
kernel creation\n        mock_create_kernel.return_value = \"kernel-123\"\n\n        executor = BlaxelExecutor(additional_imports=[], logger=logger)\n\n        patcher.stop()\n\n        assert executor.sandbox_name.startswith(\"smolagent-executor-\")\n        assert executor.image == \"blaxel/jupyter-notebook\"\n        assert executor.memory == 4096\n        assert executor.region is None\n\n    @patch(\"smolagents.remote_executors.BlaxelExecutor.install_packages\")\n    @patch(\"smolagents.remote_executors._create_kernel_http\")\n    @patch(\"blaxel.core.SandboxInstance\")\n    @patch(\"blaxel.core.settings\")\n    def test_blaxel_executor_custom_parameters(\n        self, mock_settings, mock_sandbox_instance, mock_create_kernel, mock_install_packages\n    ):\n        \"\"\"Test BlaxelExecutor with custom parameters.\"\"\"\n        logger = MagicMock()\n        mock_settings.headers = {}\n        mock_install_packages.return_value = [\"numpy\"]\n\n        # Mock sandbox response\n        mock_response = MagicMock()\n\n        # patch manually for Python 3.10 compatibility\n        mod = importlib.import_module(\"blaxel.core.client.api.compute\")\n        create_sandbox_patcher = patch.object(mod, \"create_sandbox\")\n        mock_create_sandbox = create_sandbox_patcher.start()\n        mock_create_sandbox.sync.return_value = mock_response\n\n        # Mock SandboxInstance\n        mock_sandbox = MagicMock()\n        mock_metadata = MagicMock()\n        mock_metadata.url = \"https://test-sandbox.us-was-1.bl.run\"\n        mock_sandbox.metadata = mock_metadata\n        mock_sandbox_instance.return_value = mock_sandbox\n\n        # Mock kernel creation\n        mock_create_kernel.return_value = \"kernel-123\"\n\n        executor = BlaxelExecutor(\n            additional_imports=[\"numpy\"],\n            logger=logger,\n            sandbox_name=\"test-sandbox\",\n            image=\"custom-image:latest\",\n            memory=8192,\n            region=\"us-was-1\",\n  
      )\n\n        create_sandbox_patcher.stop()\n\n        assert executor.sandbox_name == \"test-sandbox\"\n        assert executor.image == \"custom-image:latest\"\n        assert executor.memory == 8192\n        assert executor.region == \"us-was-1\"\n        assert mock_install_packages.called\n\n    @patch(\"smolagents.remote_executors._create_kernel_http\")\n    @patch(\"blaxel.core.SandboxInstance\")\n    @patch(\"blaxel.core.settings\")\n    def test_blaxel_executor_cleanup(self, mock_settings, mock_sandbox_instance, mock_create_kernel):\n        \"\"\"Test BlaxelExecutor cleanup method.\"\"\"\n\n        # patch manually for Python 3.10 compatibility\n        from unittest.mock import patch\n\n        mod = importlib.import_module(\"blaxel.core.client.api.compute\")\n        create_sandbox_patcher = patch.object(mod, \"create_sandbox\")\n        mock_create_sandbox = create_sandbox_patcher.start()\n        delete_sandbox_patcher = patch.object(mod, \"delete_sandbox\")\n        mock_delete_sandbox = delete_sandbox_patcher.start()\n\n        logger = MagicMock()\n        mock_settings.headers = {}\n\n        # Mock sandbox response\n        mock_response = MagicMock()\n        mock_create_sandbox.sync.return_value = mock_response\n\n        # Mock SandboxInstance\n        mock_sandbox = MagicMock()\n        mock_metadata = MagicMock()\n        mock_metadata.url = \"https://test-sandbox.bl.run\"\n        mock_sandbox.metadata = mock_metadata\n        mock_sandbox_instance.return_value = mock_sandbox\n\n        # Mock kernel creation\n        mock_create_kernel.return_value = \"kernel-123\"\n\n        executor = BlaxelExecutor(additional_imports=[], logger=logger)\n\n        # Test cleanup\n        executor.cleanup()\n        create_sandbox_patcher.stop()\n        delete_sandbox_patcher.stop()\n\n        # Verify that delete_sandbox.sync was called\n        assert mock_delete_sandbox.sync.called\n        # Verify sandbox reference was cleaned up\n        
assert not hasattr(executor, \"sandbox\")\n"
  },
  {
    "path": "tests/test_search.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n\nfrom smolagents import DuckDuckGoSearchTool\n\nfrom .test_tools import ToolTesterMixin\nfrom .utils.markers import require_run_all\n\n\nclass TestDuckDuckGoSearchTool(ToolTesterMixin):\n    def setup_method(self):\n        self.tool = DuckDuckGoSearchTool()\n        self.tool.setup()\n\n    @require_run_all\n    def test_exact_match_arg(self):\n        result = self.tool(\"Agents\")\n        assert isinstance(result, str)\n\n    @require_run_all\n    def test_agent_type_output(self):\n        super().test_agent_type_output()\n"
  },
  {
    "path": "tests/test_serialization.py",
    "content": "#!/usr/bin/env python\n# coding=utf-8\n\n# Copyright 2024 The HuggingFace Inc. team. All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n\"\"\"\nComprehensive tests for SafeSerializer covering edge cases, error handling,\nperformance, and integration scenarios.\n\"\"\"\n\nimport base64\nimport json\nimport pickle\nimport warnings\nfrom datetime import datetime, timedelta\nfrom decimal import Decimal\nfrom pathlib import Path\n\nimport pytest\n\nfrom smolagents.serialization import SafeSerializer, SerializationError\n\n\n# Module-level class for pickle tests (local classes can't be pickled)\nclass PicklableCustomClass:\n    \"\"\"A simple class that can be pickled.\"\"\"\n\n    def __init__(self):\n        self.value = 42\n\n\nclass TestSafeSerializationSecurity:\n    \"\"\"Test that safe mode properly blocks pickle.\"\"\"\n\n    def test_safe_mode_blocks_custom_classes(self):\n        \"\"\"Verify custom classes cannot be serialized in safe mode.\"\"\"\n\n        class CustomClass:\n            def __init__(self):\n                self.value = 42\n\n        obj = CustomClass()\n\n        # Should raise SerializationError in safe mode\n        with pytest.raises(SerializationError, match=\"Cannot safely serialize\"):\n            SafeSerializer.dumps(obj, allow_pickle=False)\n\n    def test_safe_mode_blocks_pickle_deserialization(self):\n        \"\"\"Verify pickle data is rejected in safe mode.\"\"\"\n\n        # Create 
pickle data (no \"safe:\" prefix)\n        pickle_data = base64.b64encode(pickle.dumps({\"test\": \"data\"})).decode()\n\n        # Should raise error in safe mode\n        with pytest.raises(SerializationError, match=\"Pickle data rejected\"):\n            SafeSerializer.loads(pickle_data, allow_pickle=False)\n\n    def test_pickle_fallback_with_warning(self):\n        \"\"\"Verify pickle fallback works but warns in legacy mode.\"\"\"\n\n        obj = PicklableCustomClass()\n\n        # Should work but emit warning\n        with warnings.catch_warnings(record=True) as w:\n            warnings.simplefilter(\"always\")\n            serialized = SafeSerializer.dumps(obj, allow_pickle=True)\n\n            # Check warning was raised\n            assert len(w) == 1\n            assert issubclass(w[0].category, FutureWarning)\n            assert \"insecure pickle\" in str(w[0].message).lower()\n\n        # Should deserialize successfully (with warning)\n        with warnings.catch_warnings(record=True) as w:\n            warnings.simplefilter(\"always\")\n            result = SafeSerializer.loads(serialized, allow_pickle=True)\n\n            assert result.value == 42\n            assert len(w) == 1\n            assert \"pickle data\" in str(w[0].message).lower()\n\n\nclass TestSafeSerializationRoundtrip:\n    \"\"\"Test that safe types serialize and deserialize correctly.\"\"\"\n\n    def test_primitives(self):\n        \"\"\"Test basic Python types.\"\"\"\n        test_cases = [\n            None,\n            True,\n            False,\n            42,\n            3.14,\n            \"hello\",\n            b\"bytes\",\n            complex(1, 2),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            assert serialized.startswith(\"safe:\")\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_collections(self):\n        
\"\"\"Test collections.\"\"\"\n        test_cases = [\n            [1, 2, 3],\n            {\"key\": \"value\", \"nested\": {\"a\": 1}},\n            (1, 2, 3),\n            {1, 2, 3},\n            frozenset([1, 2, 3]),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_datetime_types(self):\n        \"\"\"Test datetime module types.\"\"\"\n        now = datetime.now()\n        test_cases = [\n            now,\n            now.date(),\n            now.time(),\n            timedelta(days=1, hours=2, minutes=3),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_special_types(self):\n        \"\"\"Test Decimal and Path.\"\"\"\n        test_cases = [\n            Decimal(\"3.14159\"),\n            Path(\"/tmp/test.txt\"),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_complex_nested_structure(self):\n        \"\"\"Test deeply nested structures.\"\"\"\n        obj = {\n            \"primitives\": [1, 2.5, \"string\", None, True],\n            \"collections\": {\n                \"list\": [1, 2, 3],\n                \"tuple\": (4, 5, 6),\n                \"set\": {7, 8, 9},\n            },\n            \"datetime\": datetime.now(),\n            \"path\": Path(\"/tmp\"),\n            \"bytes\": b\"binary data\",\n        }\n\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        assert serialized.startswith(\"safe:\")\n        result = SafeSerializer.loads(serialized, 
allow_pickle=False)\n\n        # Check structure is preserved\n        assert result[\"primitives\"] == obj[\"primitives\"]\n        assert result[\"collections\"][\"list\"] == obj[\"collections\"][\"list\"]\n        assert result[\"datetime\"] == obj[\"datetime\"]\n        assert result[\"path\"] == obj[\"path\"]\n        assert result[\"bytes\"] == obj[\"bytes\"]\n\n\nclass TestBackwardCompatibility:\n    \"\"\"Test that legacy pickle data can still be read when explicitly allowed.\"\"\"\n\n    def test_read_legacy_pickle_data(self):\n        \"\"\"Verify we can read old pickle data when allow_pickle=True.\"\"\"\n\n        # Simulate legacy pickle data (no \"safe:\" prefix)\n        legacy_data = {\"key\": \"value\", \"number\": 42}\n        pickle_encoded = base64.b64encode(pickle.dumps(legacy_data)).decode()\n\n        # Should work with allow_pickle=True\n        with warnings.catch_warnings(record=True) as w:\n            warnings.simplefilter(\"always\")\n            result = SafeSerializer.loads(pickle_encoded, allow_pickle=True)\n\n            assert result == legacy_data\n            assert len(w) == 1  # Warning emitted\n            assert \"pickle data\" in str(w[0].message).lower()\n\n    def test_safe_data_is_preferred(self):\n        \"\"\"Verify safe serialization is used even when pickle is allowed.\"\"\"\n\n        # Basic dict should use safe serialization\n        obj = {\"key\": [1, 2, 3]}\n\n        with warnings.catch_warnings(record=True) as w:\n            warnings.simplefilter(\"always\")\n            serialized = SafeSerializer.dumps(obj, allow_pickle=True)\n\n            # Should use safe format (no warning)\n            assert serialized.startswith(\"safe:\")\n            assert len(w) == 0  # No warning because safe was used\n\n\nclass TestDefaultBehavior:\n    \"\"\"Test that defaults are secure.\"\"\"\n\n    def test_dumps_defaults_to_safe(self):\n        \"\"\"Verify dumps defaults to safe mode.\"\"\"\n        obj = {\"key\": 
\"value\"}\n\n        # Call without the allow_pickle argument - should default to safe mode\n        serialized = SafeSerializer.dumps(obj)\n        assert serialized.startswith(\"safe:\")\n\n        # Should be deserializable in safe mode\n        result = SafeSerializer.loads(serialized)\n        assert result == obj\n\n    def test_loads_defaults_to_safe(self):\n        \"\"\"Verify loads defaults to safe mode.\"\"\"\n        # Create safe data\n        obj = {\"key\": \"value\"}\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n\n        # Call without the allow_pickle argument - should default to safe mode\n        result = SafeSerializer.loads(serialized)\n        assert result == obj\n\n        # Create pickle data\n        pickle_data = base64.b64encode(pickle.dumps(obj)).decode()\n\n        # Should reject pickle data by default\n        with pytest.raises(SerializationError, match=\"Pickle data rejected\"):\n            SafeSerializer.loads(pickle_data)\n\n\nclass TestEdgeCases:\n    \"\"\"Test edge cases and boundary conditions.\"\"\"\n\n    def test_empty_data(self):\n        \"\"\"Test serialization of empty collections.\"\"\"\n        test_cases = [\n            [],\n            {},\n            (),\n            set(),\n            frozenset(),\n            \"\",\n            b\"\",\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_nested_empty_structures(self):\n        \"\"\"Test deeply nested empty structures.\"\"\"\n        obj = {\n            \"empty_list\": [],\n            \"empty_dict\": {},\n            \"nested\": {\n                \"empty_tuple\": (),\n                \"empty_set\": set(),\n                \"deeply_nested\": {\"still_empty\": []},\n            },\n        }\n\n        serialized = SafeSerializer.dumps(obj, 
allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result == obj\n\n    def test_very_large_numbers(self):\n        \"\"\"Test handling of very large integers and floats.\"\"\"\n        test_cases = [\n            10**100,  # Very large int\n            -(10**100),  # Very large negative int\n            1.7976931348623157e308,  # Near max float\n            2.2250738585072014e-308,  # Near min positive float\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_special_float_values(self):\n        \"\"\"Test special float values (infinity, nan).\"\"\"\n        import math\n\n        # Note: NaN != NaN, so we handle it specially\n        test_cases = [\n            (float(\"inf\"), float(\"inf\")),\n            (float(\"-inf\"), float(\"-inf\")),\n        ]\n\n        for obj, expected in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == expected\n\n        # NaN special case\n        nan_obj = float(\"nan\")\n        serialized = SafeSerializer.dumps(nan_obj, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert math.isnan(result)\n\n    def test_unicode_strings(self):\n        \"\"\"Test handling of various unicode strings.\"\"\"\n        test_cases = [\n            \"Hello 世界\",  # Mixed ASCII and Chinese\n            \"🚀🎉💎\",  # Emojis\n            \"Ñoño\",  # Accented characters\n            \"\\u0000\",  # Null character\n            \"Line1\\nLine2\\tTabbed\",  # Escape sequences\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = 
SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_very_long_strings(self):\n        \"\"\"Test handling of very long strings.\"\"\"\n        long_string = \"a\" * 1_000_000  # 1MB string\n\n        serialized = SafeSerializer.dumps(long_string, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result == long_string\n\n    def test_deeply_nested_structures(self):\n        \"\"\"Test deeply nested data structures.\"\"\"\n        # Create nested structure\n        obj = {\"level\": 0}\n        current = obj\n        for i in range(1, 100):  # 100 levels deep\n            current[\"nested\"] = {\"level\": i}\n            current = current[\"nested\"]\n\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result == obj\n\n    def test_dict_with_tuple_keys(self):\n        \"\"\"Test dictionaries with tuple keys.\"\"\"\n        obj = {\n            (1, 2): \"tuple_key\",\n            (3, 4, 5): \"longer_tuple\",\n            (\"a\", \"b\"): \"string_tuple\",\n        }\n\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result == obj\n\n    def test_dict_with_integer_keys(self):\n        \"\"\"Test dictionaries with non-string keys.\"\"\"\n        obj = {\n            1: \"one\",\n            2: \"two\",\n            100: \"hundred\",\n        }\n\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result == obj\n\n    def test_mixed_collection_types(self):\n        \"\"\"Test mixed collection types in one structure.\"\"\"\n        obj = {\n            \"list\": [1, 2, 3],\n            \"tuple\": (4, 5, 6),\n            \"set\": {7, 8, 9},\n            
\"frozenset\": frozenset([10, 11, 12]),\n            \"nested_list\": [[1, 2], [3, 4]],\n            \"list_of_tuples\": [(1, 2), (3, 4)],\n        }\n\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        # Compare each field\n        assert result[\"list\"] == obj[\"list\"]\n        assert result[\"tuple\"] == obj[\"tuple\"]\n        assert result[\"set\"] == obj[\"set\"]\n        assert result[\"frozenset\"] == obj[\"frozenset\"]\n        assert result[\"nested_list\"] == obj[\"nested_list\"]\n        assert result[\"list_of_tuples\"] == obj[\"list_of_tuples\"]\n\n\nclass TestErrorHandling:\n    \"\"\"Test error handling and malformed data.\"\"\"\n\n    def test_invalid_json_data(self):\n        \"\"\"Test handling of invalid JSON data.\"\"\"\n        with pytest.raises((json.JSONDecodeError, SerializationError)):\n            SafeSerializer.loads(\"safe:invalid json\", allow_pickle=False)\n\n    def test_corrupted_safe_prefix(self):\n        \"\"\"Test handling of data with safe prefix but invalid JSON.\"\"\"\n        with pytest.raises((json.JSONDecodeError, SerializationError)):\n            SafeSerializer.loads(\"safe:{broken\", allow_pickle=False)\n\n    def test_missing_type_field(self):\n        \"\"\"Test handling of malformed type markers.\"\"\"\n        # Valid JSON but missing required fields\n        malformed = \"safe:\" + json.dumps({\"data\": [1, 2, 3]})  # Missing __type__\n\n        # Should still work as regular dict\n        result = SafeSerializer.loads(malformed, allow_pickle=False)\n        assert result == {\"data\": [1, 2, 3]}\n\n    def test_unknown_type_marker(self):\n        \"\"\"Test handling of unknown type markers.\"\"\"\n        unknown_type = \"safe:\" + json.dumps({\"__type__\": \"unknown_type\", \"data\": \"something\"})\n\n        # Should return as dict with type marker\n        result = SafeSerializer.loads(unknown_type, 
allow_pickle=False)\n        assert \"__type__\" in result\n\n    def test_invalid_base64_in_bytes(self):\n        \"\"\"Test handling of invalid base64 in bytes type.\"\"\"\n        invalid_bytes = \"safe:\" + json.dumps({\"__type__\": \"bytes\", \"data\": \"not-valid-base64!!!\"})\n\n        with pytest.raises(Exception):  # Will raise base64 decode error\n            SafeSerializer.loads(invalid_bytes, allow_pickle=False)\n\n    def test_serialization_of_none_type(self):\n        \"\"\"Test that None type is handled correctly.\"\"\"\n        obj = None\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result is None\n\n    def test_serialization_of_function(self):\n        \"\"\"Test that functions cannot be serialized safely.\"\"\"\n\n        def my_function():\n            pass\n\n        with pytest.raises(SerializationError):\n            SafeSerializer.dumps(my_function, allow_pickle=False)\n\n    def test_serialization_of_class(self):\n        \"\"\"Test that classes cannot be serialized safely.\"\"\"\n\n        class MyClass:\n            pass\n\n        with pytest.raises(SerializationError):\n            SafeSerializer.dumps(MyClass, allow_pickle=False)\n\n    def test_serialization_of_module(self):\n        \"\"\"Test that modules cannot be serialized safely.\"\"\"\n        import os\n\n        with pytest.raises(SerializationError):\n            SafeSerializer.dumps(os, allow_pickle=False)\n\n\nclass TestTypeCoverage:\n    \"\"\"Test all supported types comprehensively.\"\"\"\n\n    def test_all_datetime_types(self):\n        \"\"\"Test all datetime module types.\"\"\"\n        from datetime import date, datetime, time\n\n        test_cases = [\n            datetime(2024, 1, 1, 12, 30, 45),\n            date(2024, 1, 1),\n            time(12, 30, 45),\n            timedelta(days=5, hours=3, minutes=30, seconds=15),\n            datetime.min,\n      
      datetime.max,\n            date.min,\n            date.max,\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_decimal_precision(self):\n        \"\"\"Test Decimal type with various precisions.\"\"\"\n        from decimal import getcontext\n\n        # Set high precision\n        getcontext().prec = 50\n\n        test_cases = [\n            Decimal(\"3.14159265358979323846264338327950288419716939937510\"),\n            Decimal(\"0.1\") + Decimal(\"0.2\"),  # Famous float precision issue\n            Decimal(\"1e-100\"),\n            Decimal(\"1e100\"),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_pathlib_types(self):\n        \"\"\"Test various Path types.\"\"\"\n\n        test_cases = [\n            Path(\"/tmp/test.txt\"),\n            Path(\"relative/path/file.py\"),\n            Path(\"/\"),\n            Path(\".\"),\n            Path(\"..\"),\n            Path(\"/path/with spaces/file.txt\"),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n    def test_complex_numbers(self):\n        \"\"\"Test complex number handling.\"\"\"\n        test_cases = [\n            complex(1, 2),\n            complex(0, 0),\n            complex(-5, 10),\n            complex(3.14, 2.71),\n            1 + 2j,\n            -5 + 10j,\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, 
allow_pickle=False)\n            assert result == obj\n\n    def test_bytes_types(self):\n        \"\"\"Test various bytes objects.\"\"\"\n        test_cases = [\n            b\"hello\",\n            b\"\\x00\\x01\\x02\\x03\",\n            b\"Binary\\xff\\xfe\\xfd\",\n            bytes(range(256)),  # All byte values\n            b\"\",  # Empty bytes\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj\n\n\nclass TestNumpySupport:\n    \"\"\"Test numpy array serialization (optional, skip if not installed).\"\"\"\n\n    def test_numpy_array(self):\n        \"\"\"Test numpy array roundtrip.\"\"\"\n        pytest.importorskip(\"numpy\")\n        import numpy as np\n\n        arr = np.array([[1, 2], [3, 4]], dtype=np.float32)\n\n        serialized = SafeSerializer.dumps(arr, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        np.testing.assert_array_equal(result, arr)\n        assert result.dtype == arr.dtype\n\n    def test_numpy_scalars(self):\n        \"\"\"Test numpy scalar types.\"\"\"\n        pytest.importorskip(\"numpy\")\n        import numpy as np\n\n        test_cases = [\n            np.int32(42),\n            np.float64(3.14),\n        ]\n\n        for obj in test_cases:\n            serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == obj.item()\n\n    def test_numpy_various_dtypes(self):\n        \"\"\"Test numpy arrays with various dtypes.\"\"\"\n        pytest.importorskip(\"numpy\")\n        import numpy as np\n\n        # Test numeric dtypes (non-complex)\n        dtypes = [\n            np.int8,\n            np.int16,\n            np.int32,\n            np.int64,\n            np.uint8,\n            np.uint16,\n           
 np.uint32,\n            np.uint64,\n            np.float16,\n            np.float32,\n            np.float64,\n            np.bool_,\n        ]\n\n        for dtype in dtypes:\n            arr = np.array([1, 2, 3], dtype=dtype)\n            serialized = SafeSerializer.dumps(arr, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            np.testing.assert_array_equal(result, arr)\n            assert result.dtype == arr.dtype\n\n        # Complex dtypes need special handling - test separately\n        # Note: numpy complex arrays are not fully supported in safe mode\n        # as they require custom complex number serialization\n\n    def test_numpy_multidimensional(self):\n        \"\"\"Test multidimensional numpy arrays.\"\"\"\n        pytest.importorskip(\"numpy\")\n        import numpy as np\n\n        test_cases = [\n            np.array([[1, 2], [3, 4]]),  # 2D\n            np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]),  # 3D\n            np.zeros((10, 10, 10)),  # Large 3D\n            np.ones((5, 5)),  # 2D ones\n        ]\n\n        for arr in test_cases:\n            serialized = SafeSerializer.dumps(arr, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            np.testing.assert_array_equal(result, arr)\n\n    def test_numpy_empty_array(self):\n        \"\"\"Test empty numpy array.\"\"\"\n        pytest.importorskip(\"numpy\")\n        import numpy as np\n\n        arr = np.array([])\n        serialized = SafeSerializer.dumps(arr, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        np.testing.assert_array_equal(result, arr)\n\n\nclass TestPILSupport:\n    \"\"\"Test PIL Image serialization (optional, skip if not installed).\"\"\"\n\n    def test_pil_image(self):\n        \"\"\"Test PIL Image roundtrip.\"\"\"\n        pytest.importorskip(\"PIL\")\n        from PIL import Image\n\n        # Create a simple test 
image\n        img = Image.new(\"RGB\", (10, 10), color=\"red\")\n\n        serialized = SafeSerializer.dumps(img, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        assert isinstance(result, Image.Image)\n        assert result.size == img.size\n        assert result.mode == img.mode\n\n    def test_pil_various_modes(self):\n        \"\"\"Test PIL images in various modes.\"\"\"\n        pytest.importorskip(\"PIL\")\n        from PIL import Image\n\n        modes = [\"RGB\", \"RGBA\", \"L\", \"1\"]  # Color, Alpha, Grayscale, Binary\n\n        for mode in modes:\n            img = Image.new(mode, (10, 10), color=\"red\" if mode != \"1\" else 1)\n            serialized = SafeSerializer.dumps(img, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n            assert isinstance(result, Image.Image)\n            assert result.mode == img.mode\n            assert result.size == img.size\n\n    def test_pil_various_sizes(self):\n        \"\"\"Test PIL images of various sizes.\"\"\"\n        pytest.importorskip(\"PIL\")\n        from PIL import Image\n\n        sizes = [(1, 1), (100, 100), (500, 300)]\n\n        for size in sizes:\n            img = Image.new(\"RGB\", size, color=\"blue\")\n            serialized = SafeSerializer.dumps(img, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n            assert result.size == img.size\n\n\nclass TestDataclasses:\n    \"\"\"Test dataclass serialization.\"\"\"\n\n    def test_simple_dataclass(self):\n        \"\"\"Test simple dataclass.\"\"\"\n        from dataclasses import dataclass\n\n        @dataclass\n        class Person:\n            name: str\n            age: int\n\n        person = Person(name=\"Alice\", age=30)\n        serialized = SafeSerializer.dumps(person, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        # Result is 
a dict representation\n        assert result[\"__dataclass__\"] == \"Person\"\n        assert result[\"name\"] == \"Alice\"\n        assert result[\"age\"] == 30\n\n    def test_nested_dataclass(self):\n        \"\"\"Test nested dataclasses.\"\"\"\n        from dataclasses import dataclass\n\n        @dataclass\n        class Address:\n            street: str\n            city: str\n\n        @dataclass\n        class Person:\n            name: str\n            address: Address\n\n        person = Person(name=\"Bob\", address=Address(street=\"123 Main St\", city=\"NYC\"))\n        serialized = SafeSerializer.dumps(person, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        assert result[\"name\"] == \"Bob\"\n        assert result[\"address\"][\"street\"] == \"123 Main St\"\n\n\nclass TestPerformance:\n    \"\"\"Performance tests for large data.\"\"\"\n\n    def test_large_list(self):\n        \"\"\"Test serialization of large list.\"\"\"\n        large_list = list(range(100_000))\n\n        serialized = SafeSerializer.dumps(large_list, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        assert result == large_list\n\n    def test_large_dict(self):\n        \"\"\"Test serialization of large dictionary.\"\"\"\n        large_dict = {f\"key_{i}\": i for i in range(10_000)}\n\n        serialized = SafeSerializer.dumps(large_dict, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        assert result == large_dict\n\n    def test_deeply_nested_performance(self):\n        \"\"\"Test performance with deeply nested structures.\"\"\"\n        obj = {\"level\": 0}\n        current = obj\n        for i in range(1, 100):  # 100 levels (avoid recursion limit)\n            current[\"nested\"] = {\"level\": i}\n            current = current[\"nested\"]\n\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n        result = 
SafeSerializer.loads(serialized, allow_pickle=False)\n\n        assert result == obj\n\n\nclass TestPrefixHandling:\n    \"\"\"Test handling of different prefix formats.\"\"\"\n\n    def test_safe_prefix_detection(self):\n        \"\"\"Test detection of safe: prefix.\"\"\"\n        obj = {\"test\": \"data\"}\n        serialized = SafeSerializer.dumps(obj, allow_pickle=False)\n\n        assert serialized.startswith(\"safe:\")\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n        assert result == obj\n\n    def test_pickle_prefix_with_allow_pickle(self):\n        \"\"\"Test pickle: prefix when pickle is allowed.\"\"\"\n        # Create an object that needs pickle\n        obj = PicklableCustomClass()\n        serialized = SafeSerializer.dumps(obj, allow_pickle=True)\n\n        # Should have pickle prefix\n        assert serialized.startswith(\"pickle:\")\n\n        result = SafeSerializer.loads(serialized, allow_pickle=True)\n        assert result.value == 42\n\n    def test_legacy_format_detection(self):\n        \"\"\"Test detection and handling of legacy format (no prefix).\"\"\"\n        # Simulate legacy pickle data (no prefix)\n        legacy_data = {\"key\": \"value\"}\n        legacy_encoded = base64.b64encode(pickle.dumps(legacy_data)).decode()\n\n        # Should work with allow_pickle=True\n        result = SafeSerializer.loads(legacy_encoded, allow_pickle=True)\n        assert result == legacy_data\n\n\nclass TestRealWorldScenarios:\n    \"\"\"Test real-world usage scenarios.\"\"\"\n\n    def test_agent_variables_scenario(self):\n        \"\"\"Test typical agent variables scenario.\"\"\"\n        import numpy as np\n        from PIL import Image\n\n        # Typical variables an agent might use\n        variables = {\n            \"search_results\": [\"result1\", \"result2\", \"result3\"],\n            \"config\": {\n                \"temperature\": 0.7,\n                \"max_tokens\": 100,\n                \"model\": 
\"gpt-4\",\n            },\n            \"data_array\": np.array([1.0, 2.0, 3.0]),\n            \"image\": Image.new(\"RGB\", (50, 50)),\n            \"timestamp\": datetime.now(),\n            \"status\": \"running\",\n            \"counter\": 42,\n        }\n\n        serialized = SafeSerializer.dumps(variables, allow_pickle=False)\n        result = SafeSerializer.loads(serialized, allow_pickle=False)\n\n        assert result[\"search_results\"] == variables[\"search_results\"]\n        assert result[\"config\"] == variables[\"config\"]\n        assert result[\"status\"] == variables[\"status\"]\n        assert result[\"counter\"] == variables[\"counter\"]\n\n    def test_final_answer_scenario(self):\n        \"\"\"Test typical final answer serialization.\"\"\"\n        final_answers = [\n            \"Simple string answer\",\n            {\"answer\": \"structured\", \"confidence\": 0.95},\n            [\"multiple\", \"results\", \"returned\"],\n            42,\n            3.14159,\n            True,\n        ]\n\n        for answer in final_answers:\n            serialized = SafeSerializer.dumps(answer, allow_pickle=False)\n            result = SafeSerializer.loads(serialized, allow_pickle=False)\n            assert result == answer\n\n\nclass TestGeneratedDeserializerCode:\n    \"\"\"Regression tests for generated deserializer code used by remote executors.\"\"\"\n\n    def test_generated_deserializer_executes_for_safe_payload(self):\n        code = SafeSerializer.get_deserializer_code(allow_pickle=False)\n        namespace = {}\n        exec(code, namespace, namespace)\n\n        payload = SafeSerializer.dumps(\n            {\n                \"count\": 3,\n                \"items\": (1, 2, 3),\n                \"raw\": b\"bytes\",\n            },\n            allow_pickle=False,\n        )\n        result = namespace[\"_deserialize\"](payload)\n        assert result == {\"count\": 3, \"items\": (1, 2, 3), \"raw\": b\"bytes\"}\n\n    def 
test_generated_deserializer_handles_pickle_prefix_when_enabled(self):\n        code = SafeSerializer.get_deserializer_code(allow_pickle=True)\n        namespace = {}\n        exec(code, namespace, namespace)\n\n        payload = \"pickle:\" + base64.b64encode(pickle.dumps({\"hello\": \"world\"})).decode()\n        result = namespace[\"_deserialize\"](payload)\n        assert result == {\"hello\": \"world\"}\n\n\nclass TestConcurrency:\n    \"\"\"Test thread safety and concurrent access.\"\"\"\n\n    def test_concurrent_serialization(self):\n        \"\"\"Test concurrent serialization operations.\"\"\"\n        import threading\n\n        results = []\n        errors = []\n\n        def serialize_data(data, index):\n            try:\n                serialized = SafeSerializer.dumps(data, allow_pickle=False)\n                deserialized = SafeSerializer.loads(serialized, allow_pickle=False)\n                results.append((index, deserialized == data))\n            except Exception as e:\n                errors.append((index, e))\n\n        threads = []\n        test_data = [{\"thread\": i, \"data\": list(range(100))} for i in range(10)]\n\n        for i, data in enumerate(test_data):\n            thread = threading.Thread(target=serialize_data, args=(data, i))\n            threads.append(thread)\n            thread.start()\n\n        for thread in threads:\n            thread.join()\n\n        assert len(errors) == 0, f\"Errors occurred: {errors}\"\n        assert len(results) == 10\n        assert all(success for _, success in results)\n"
  },
  {
    "path": "tests/test_telemetry.py",
    "content": "# coding=utf-8\n# Copyright 2025 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n# Source: https://github.com/Arize-ai/openinference/blob/main/python/instrumentation/openinference-instrumentation-smolagents/tests/openinference/instrumentation/smolagents/test_instrumentor.py\n\nfrom typing import Generator\n\nimport pytest\n\nfrom .utils.markers import require_run_all\n\n\n# Add this at the module level to skip all tests if OpenTelemetry is not available\npytest.importorskip(\"opentelemetry\", reason=\"requires opentelemetry\")\npytest.importorskip(\n    \"openinference.instrumentation.smolagents\", reason=\"requires openinference.instrumentation.smolagents\"\n)\n\nfrom openinference.instrumentation.smolagents import SmolagentsInstrumentor\nfrom opentelemetry import trace as trace_api\nfrom opentelemetry.sdk import trace as trace_sdk\nfrom opentelemetry.sdk.resources import Resource\nfrom opentelemetry.sdk.trace.export import SimpleSpanProcessor\nfrom opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter\n\nfrom smolagents.models import InferenceClientModel\n\n\n@pytest.fixture\ndef in_memory_span_exporter() -> InMemorySpanExporter:\n    return InMemorySpanExporter()\n\n\n@pytest.fixture\ndef tracer_provider(in_memory_span_exporter: InMemorySpanExporter) -> trace_api.TracerProvider:\n    resource = Resource(attributes={})\n    tracer_provider = trace_sdk.TracerProvider(resource=resource)\n    
span_processor = SimpleSpanProcessor(span_exporter=in_memory_span_exporter)\n    tracer_provider.add_span_processor(span_processor=span_processor)\n    return tracer_provider\n\n\n@pytest.fixture(autouse=True)\ndef instrument(\n    tracer_provider: trace_api.TracerProvider,\n    in_memory_span_exporter: InMemorySpanExporter,\n) -> Generator[None, None, None]:\n    SmolagentsInstrumentor().instrument(tracer_provider=tracer_provider, skip_dep_check=True)\n    yield\n    SmolagentsInstrumentor().uninstrument()\n    in_memory_span_exporter.clear()\n\n\n@require_run_all\nclass TestOpenTelemetry:\n    def test_model(self, in_memory_span_exporter: InMemorySpanExporter):\n        model = InferenceClientModel()\n        _ = model(\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": [\n                        {\n                            \"type\": \"text\",\n                            \"text\": \"Who won the World Cup in 2018? Answer in one word with no punctuation.\",\n                        }\n                    ],\n                }\n            ]\n        )\n        spans = in_memory_span_exporter.get_finished_spans()\n        assert len(spans) == 1\n        span = spans[0]\n        assert span.name == \"InferenceClientModel.generate\"\n        assert span.status.is_ok\n        assert span.attributes\n"
  },
  {
    "path": "tests/test_tool_validation.py",
    "content": "import ast\nfrom textwrap import dedent\n\nimport pytest\n\nfrom smolagents.default_tools import (\n    DuckDuckGoSearchTool,\n    GoogleSearchTool,\n    SpeechToTextTool,\n    VisitWebpageTool,\n    WebSearchTool,\n)\nfrom smolagents.tool_validation import MethodChecker, validate_tool_attributes\nfrom smolagents.tools import Tool, tool\n\n\nUNDEFINED_VARIABLE = \"undefined_variable\"\n\n\n@pytest.mark.parametrize(\n    \"tool_class\", [DuckDuckGoSearchTool, GoogleSearchTool, SpeechToTextTool, VisitWebpageTool, WebSearchTool]\n)\ndef test_validate_tool_attributes_with_default_tools(tool_class):\n    assert validate_tool_attributes(tool_class) is None, f\"failed for {tool_class.name} tool\"\n\n\nclass ValidTool(Tool):\n    name = \"valid_tool\"\n    description = \"A valid tool\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n    simple_attr = \"string\"\n    dict_attr = {\"key\": \"value\"}\n\n    def __init__(self, optional_param=\"default\"):\n        super().__init__()\n        self.param = optional_param\n\n    def forward(self, input: str) -> str:\n        return input.upper()\n\n\n@tool\ndef valid_tool_function(input: str) -> str:\n    \"\"\"A valid tool function.\n\n    Args:\n        input (str): Input string.\n    \"\"\"\n    return input.upper()\n\n\n@pytest.mark.parametrize(\"tool_class\", [ValidTool, valid_tool_function.__class__])\ndef test_validate_tool_attributes_valid(tool_class):\n    assert validate_tool_attributes(tool_class) is None\n\n\nclass InvalidToolName(Tool):\n    name = \"invalid tool name\"\n    description = \"Tool with invalid name\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n\n    def __init__(self):\n        super().__init__()\n\n    def forward(self, input: str) -> str:\n        return input\n\n\nclass InvalidToolComplexAttrs(Tool):\n    name = \"invalid_tool\"\n    description = \"Tool 
with complex class attributes\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n    complex_attr = [x for x in range(3)]  # Complex class attribute\n\n    def __init__(self):\n        super().__init__()\n\n    def forward(self, input: str) -> str:\n        return input\n\n\nclass InvalidToolRequiredParams(Tool):\n    name = \"invalid_tool\"\n    description = \"Tool with required params\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n\n    def __init__(self, required_param, kwarg1=1):  # No default value\n        super().__init__()\n        self.param = required_param\n\n    def forward(self, input: str) -> str:\n        return input\n\n\nclass InvalidToolNonLiteralDefaultParam(Tool):\n    name = \"invalid_tool\"\n    description = \"Tool with non-literal default parameter value\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n\n    def __init__(self, default_param=UNDEFINED_VARIABLE):  # UNDEFINED_VARIABLE as default is non-literal\n        super().__init__()\n        self.default_param = default_param\n\n    def forward(self, input: str) -> str:\n        return input\n\n\nclass InvalidToolUndefinedNames(Tool):\n    name = \"invalid_tool\"\n    description = \"Tool with undefined names\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n\n    def forward(self, input: str) -> str:\n        return UNDEFINED_VARIABLE  # Undefined name\n\n\n@pytest.mark.parametrize(\n    \"tool_class, expected_error\",\n    [\n        (\n            InvalidToolName,\n            \"Class attribute 'name' must be a valid Python identifier and not a reserved keyword, found 'invalid tool name'\",\n        ),\n        (InvalidToolComplexAttrs, \"Complex attributes should be defined in __init__, not as class attributes\"),\n        
(InvalidToolRequiredParams, \"Parameters in __init__ must have default values, found required parameters\"),\n        (\n            InvalidToolNonLiteralDefaultParam,\n            \"Parameters in __init__ must have literal default values, found non-literal defaults\",\n        ),\n        (InvalidToolUndefinedNames, \"Name 'UNDEFINED_VARIABLE' is undefined\"),\n    ],\n)\ndef test_validate_tool_attributes_exceptions(tool_class, expected_error):\n    with pytest.raises(ValueError, match=expected_error):\n        validate_tool_attributes(tool_class)\n\n\nclass MultipleAssignmentsTool(Tool):\n    name = \"multiple_assignments_tool\"\n    description = \"Tool with multiple assignments\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n\n    def __init__(self):\n        super().__init__()\n\n    def forward(self, input: str) -> str:\n        a, b = \"1\", \"2\"\n        return a + b\n\n\ndef test_validate_tool_attributes_multiple_assignments():\n    validate_tool_attributes(MultipleAssignmentsTool)\n\n\n@tool\ndef tool_function_with_multiple_assignments(input: str) -> str:\n    \"\"\"A valid tool function.\n\n    Args:\n        input (str): Input string.\n    \"\"\"\n    a, b = \"1\", \"2\"\n    return input.upper() + a + b\n\n\n@pytest.mark.parametrize(\"tool_instance\", [MultipleAssignmentsTool(), tool_function_with_multiple_assignments])\ndef test_tool_to_dict_validation_with_multiple_assignments(tool_instance):\n    tool_instance.to_dict()\n\n\nclass TestMethodChecker:\n    def test_multiple_assignments(self):\n        source_code = dedent(\n            \"\"\"\n            def forward(self) -> str:\n                a, b = \"1\", \"2\"\n                return a + b\n            \"\"\"\n        )\n        method_checker = MethodChecker(set())\n        method_checker.visit(ast.parse(source_code))\n        assert method_checker.errors == []\n"
  },
  {
    "path": "tests/test_tools.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport inspect\nimport os\nimport warnings\nfrom textwrap import dedent\nfrom typing import Any, Literal\nfrom unittest.mock import MagicMock, patch\n\nimport mcp\nimport numpy as np\nimport PIL.Image\nimport pytest\n\nfrom smolagents.agent_types import _AGENT_TYPE_MAPPING\nfrom smolagents.tools import AUTHORIZED_TYPES, Tool, ToolCollection, launch_gradio_demo, tool, validate_tool_arguments\n\nfrom .utils.markers import require_run_all\n\n\nclass ToolTesterMixin:\n    def test_inputs_output(self):\n        assert hasattr(self.tool, \"inputs\")\n        assert hasattr(self.tool, \"output_type\")\n\n        inputs = self.tool.inputs\n        assert isinstance(inputs, dict)\n\n        for _, input_spec in inputs.items():\n            assert \"type\" in input_spec\n            assert \"description\" in input_spec\n            assert input_spec[\"type\"] in AUTHORIZED_TYPES\n            assert isinstance(input_spec[\"description\"], str)\n\n        output_type = self.tool.output_type\n        assert output_type in AUTHORIZED_TYPES\n\n    def test_common_attributes(self):\n        assert hasattr(self.tool, \"description\")\n        assert hasattr(self.tool, \"name\")\n        assert hasattr(self.tool, \"inputs\")\n        assert hasattr(self.tool, \"output_type\")\n\n    def test_agent_type_output(self, create_inputs):\n        inputs = 
create_inputs(self.tool.inputs)\n        output = self.tool(**inputs, sanitize_inputs_outputs=True)\n        if self.tool.output_type != \"any\":\n            agent_type = _AGENT_TYPE_MAPPING[self.tool.output_type]\n            assert isinstance(output, agent_type)\n\n    @pytest.fixture\n    def create_inputs(self, shared_datadir):\n        def _create_inputs(tool_inputs: dict[str, dict[str | type, str]]) -> dict[str, Any]:\n            inputs = {}\n\n            for input_name, input_desc in tool_inputs.items():\n                input_type = input_desc[\"type\"]\n\n                if input_type == \"string\":\n                    inputs[input_name] = \"Text input\"\n                elif input_type == \"image\":\n                    inputs[input_name] = PIL.Image.open(shared_datadir / \"000000039769.png\").resize((512, 512))\n                elif input_type == \"audio\":\n                    inputs[input_name] = np.ones(3000)\n                else:\n                    raise ValueError(f\"Invalid type requested: {input_type}\")\n\n            return inputs\n\n        return _create_inputs\n\n\nclass TestTool:\n    @pytest.mark.parametrize(\n        \"type_value, should_raise_error, error_contains\",\n        [\n            # Valid cases\n            (\"string\", False, None),\n            ([\"string\", \"number\"], False, None),\n            # Invalid cases\n            (\"invalid_type\", ValueError, \"must be one of\"),\n            ([\"string\", \"invalid_type\"], ValueError, \"must be one of\"),\n            ([123, \"string\"], TypeError, \"when type is a list, all elements must be strings\"),\n            (123, TypeError, \"must be a string or list of strings\"),\n        ],\n    )\n    def test_tool_input_type_validation(self, type_value, should_raise_error, error_contains):\n        \"\"\"Test the validation of the type property in tool inputs.\"\"\"\n\n        # Define a tool class with the test type value\n        def create_tool():\n            class 
TestTool(Tool):\n                name = \"test_tool\"\n                description = \"A tool for testing type validation\"\n                inputs = {\"text\": {\"type\": type_value, \"description\": \"Some input\"}}\n                output_type = \"string\"\n\n                def forward(self, text) -> str:\n                    return text\n\n            return TestTool()\n\n        # Check if we expect this to raise an exception\n        if should_raise_error:\n            with pytest.raises(should_raise_error) as exc_info:\n                create_tool()\n            # Verify the error message contains expected text\n            assert error_contains in str(exc_info.value)\n        else:\n            # Should not raise an exception\n            tool = create_tool()\n            assert isinstance(tool, Tool)\n\n    @pytest.mark.parametrize(\n        \"tool_fixture, expected_output\",\n        [\n            (\"no_input_tool\", 'def no_input_tool() -> string:\\n    \"\"\"Tool with no inputs\\n    \"\"\"'),\n            (\n                \"single_input_tool\",\n                'def single_input_tool(text: string) -> string:\\n    \"\"\"Tool with one input\\n\\n    Args:\\n        text: Input text\\n    \"\"\"',\n            ),\n            (\n                \"multi_input_tool\",\n                'def multi_input_tool(text: string, count: integer) -> object:\\n    \"\"\"Tool with multiple inputs\\n\\n    Args:\\n        text: Text input\\n        count: Number count\\n    \"\"\"',\n            ),\n            (\n                \"multiline_description_tool\",\n                'def multiline_description_tool(input: string) -> string:\\n    \"\"\"This is a tool with\\n    multiple lines\\n    in the description\\n\\n    Args:\\n        input: Some input\\n    \"\"\"',\n            ),\n        ],\n    )\n    def test_tool_to_code_prompt_output_format(self, tool_fixture, expected_output, request):\n        \"\"\"Test that to_code_prompt generates properly formatted 
and indented output.\"\"\"\n        tool = request.getfixturevalue(tool_fixture)\n        code_prompt = tool.to_code_prompt()\n        assert code_prompt == expected_output\n\n    @pytest.mark.parametrize(\n        \"tool_fixture, expected_output\",\n        [\n            (\n                \"no_input_tool\",\n                \"no_input_tool: Tool with no inputs\\n    Takes inputs: {}\\n    Returns an output of type: string\",\n            ),\n            (\n                \"single_input_tool\",\n                \"single_input_tool: Tool with one input\\n    Takes inputs: {'text': {'type': 'string', 'description': 'Input text'}}\\n    Returns an output of type: string\",\n            ),\n            (\n                \"multi_input_tool\",\n                \"multi_input_tool: Tool with multiple inputs\\n    Takes inputs: {'text': {'type': 'string', 'description': 'Text input'}, 'count': {'type': 'integer', 'description': 'Number count'}}\\n    Returns an output of type: object\",\n            ),\n            (\n                \"multiline_description_tool\",\n                \"multiline_description_tool: This is a tool with\\nmultiple lines\\nin the description\\n    Takes inputs: {'input': {'type': 'string', 'description': 'Some input'}}\\n    Returns an output of type: string\",\n            ),\n        ],\n    )\n    def test_tool_to_tool_calling_prompt_output_format(self, tool_fixture, expected_output, request):\n        \"\"\"Test that to_tool_calling_prompt generates properly formatted output.\"\"\"\n        tool = request.getfixturevalue(tool_fixture)\n        tool_calling_prompt = tool.to_tool_calling_prompt()\n        assert tool_calling_prompt == expected_output\n\n    def test_tool_init_with_decorator(self):\n        @tool\n        def coolfunc(a: str, b: int) -> float:\n            \"\"\"Cool function\n\n            Args:\n                a: The first argument\n                b: The second one\n            \"\"\"\n            return b + 2, a\n\n      
  assert coolfunc.output_type == \"number\"\n\n    def test_tool_init_vanilla(self):\n        class HFModelDownloadsTool(Tool):\n            name = \"model_download_counter\"\n            description = \"\"\"\n            This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.\n            It returns the name of the checkpoint.\"\"\"\n\n            inputs = {\n                \"task\": {\n                    \"type\": \"string\",\n                    \"description\": \"the task category (such as text-classification, depth-estimation, etc)\",\n                }\n            }\n            output_type = \"string\"\n\n            def forward(self, task: str) -> str:\n                return \"best model\"\n\n        tool = HFModelDownloadsTool()\n        assert list(tool.inputs.keys())[0] == \"task\"\n\n    def test_tool_init_decorator_raises_issues(self):\n        with pytest.raises(Exception) as e:\n\n            @tool\n            def coolfunc(a: str, b: int):\n                \"\"\"Cool function\n\n                Args:\n                    a: The first argument\n                    b: The second one\n                \"\"\"\n                return a + b\n\n            assert coolfunc.output_type == \"number\"\n        assert \"Tool return type not found\" in str(e)\n\n        with pytest.raises(Exception) as e:\n\n            @tool\n            def coolfunc(a: str, b: int) -> int:\n                \"\"\"Cool function\n\n                Args:\n                    a: The first argument\n                \"\"\"\n                return b + a\n\n            assert coolfunc.output_type == \"number\"\n        assert \"docstring has no description for the argument\" in str(e)\n\n    def test_saving_tool_raises_error_imports_outside_function(self, tmp_path):\n        with pytest.raises(Exception) as e:\n            import numpy as np\n\n            @tool\n            def get_current_time() -> str:\n                \"\"\"\n               
 Gets the current time.\n                \"\"\"\n                return str(np.random.random())\n\n            get_current_time.save(tmp_path)\n\n        assert \"np\" in str(e)\n\n        # Also test with classic definition\n        with pytest.raises(Exception) as e:\n\n            class GetCurrentTimeTool(Tool):\n                name = \"get_current_time_tool\"\n                description = \"Gets the current time\"\n                inputs = {}\n                output_type = \"string\"\n\n                def forward(self):\n                    return str(np.random.random())\n\n            get_current_time = GetCurrentTimeTool()\n            get_current_time.save(tmp_path)\n\n        assert \"np\" in str(e)\n\n    def test_tool_definition_raises_no_error_imports_in_function(self):\n        @tool\n        def get_current_time() -> str:\n            \"\"\"\n            Gets the current time.\n            \"\"\"\n            from datetime import datetime\n\n            return str(datetime.now())\n\n        class GetCurrentTimeTool(Tool):\n            name = \"get_current_time_tool\"\n            description = \"Gets the current time\"\n            inputs = {}\n            output_type = \"string\"\n\n            def forward(self):\n                from datetime import datetime\n\n                return str(datetime.now())\n\n    def test_tool_to_dict_allows_no_arg_in_init(self):\n        \"\"\"Test that a tool cannot be saved with required args in init\"\"\"\n\n        class FailTool(Tool):\n            name = \"specific\"\n            description = \"test description\"\n            inputs = {\"string_input\": {\"type\": \"string\", \"description\": \"input description\"}}\n            output_type = \"string\"\n\n            def __init__(self, url):\n                super().__init__(self)\n                self.url = url\n\n            def forward(self, string_input: str) -> str:\n                return self.url + string_input\n\n        fail_tool = 
FailTool(\"dummy_url\")\n        with pytest.raises(Exception) as e:\n            fail_tool.to_dict()\n        assert \"Parameters in __init__ must have default values, found required parameters\" in str(e)\n\n        class PassTool(Tool):\n            name = \"specific\"\n            description = \"test description\"\n            inputs = {\"string_input\": {\"type\": \"string\", \"description\": \"input description\"}}\n            output_type = \"string\"\n\n            def __init__(self, url: str | None = \"none\"):\n                super().__init__(self)\n                self.url = url\n\n            def forward(self, string_input: str) -> str:\n                return self.url + string_input\n\n        fail_tool = PassTool()\n        fail_tool.to_dict()\n\n    def test_saving_tool_allows_no_imports_from_outside_methods(self, tmp_path):\n        # Test that using imports from outside functions fails\n        import numpy as np\n\n        class FailTool(Tool):\n            name = \"specific\"\n            description = \"test description\"\n            inputs = {\"string_input\": {\"type\": \"string\", \"description\": \"input description\"}}\n            output_type = \"string\"\n\n            def useless_method(self):\n                self.client = np.random.random()\n                return \"\"\n\n            def forward(self, string_input):\n                return self.useless_method() + string_input\n\n        fail_tool = FailTool()\n        with pytest.raises(Exception) as e:\n            fail_tool.save(tmp_path)\n        assert \"'np' is undefined\" in str(e)\n\n        # Test that putting these imports inside functions works\n        class SuccessTool(Tool):\n            name = \"specific\"\n            description = \"test description\"\n            inputs = {\"string_input\": {\"type\": \"string\", \"description\": \"input description\"}}\n            output_type = \"string\"\n\n            def useless_method(self):\n                import numpy as 
np\n\n                self.client = np.random.random()\n                return \"\"\n\n            def forward(self, string_input):\n                return self.useless_method() + string_input\n\n        success_tool = SuccessTool()\n        success_tool.save(tmp_path)\n\n    def test_tool_missing_class_attributes_raises_error(self):\n        with pytest.raises(Exception) as e:\n\n            class GetWeatherTool(Tool):\n                name = \"get_weather\"\n                description = \"Get weather in the next days at given location.\"\n                inputs = {\n                    \"location\": {\"type\": \"string\", \"description\": \"the location\"},\n                    \"celsius\": {\n                        \"type\": \"string\",\n                        \"description\": \"the temperature type\",\n                    },\n                }\n\n                def forward(self, location: str, celsius: bool | None = False) -> str:\n                    return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n            GetWeatherTool()\n        assert \"You must set an attribute output_type\" in str(e)\n\n    def test_tool_from_decorator_optional_args(self):\n        @tool\n        def get_weather(location: str, celsius: bool | None = False) -> str:\n            \"\"\"\n            Get weather in the next days at given location.\n            Secretly this tool does not care about the location, it hates the weather everywhere.\n\n            Args:\n                location: the location\n                celsius: the temperature type\n            \"\"\"\n            return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n        assert \"nullable\" in get_weather.inputs[\"celsius\"]\n        assert get_weather.inputs[\"celsius\"][\"nullable\"]\n        assert \"nullable\" not in get_weather.inputs[\"location\"]\n\n    def test_tool_mismatching_nullable_args_raises_error(self):\n        with 
pytest.raises(Exception) as e:\n\n            class GetWeatherTool(Tool):\n                name = \"get_weather\"\n                description = \"Get weather in the next days at given location.\"\n                inputs = {\n                    \"location\": {\"type\": \"string\", \"description\": \"the location\"},\n                    \"celsius\": {\n                        \"type\": \"string\",\n                        \"description\": \"the temperature type\",\n                    },\n                }\n                output_type = \"string\"\n\n                def forward(self, location: str, celsius: bool | None = False) -> str:\n                    return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n            GetWeatherTool()\n        assert \"Nullable\" in str(e)\n\n        with pytest.raises(Exception) as e:\n\n            class GetWeatherTool2(Tool):\n                name = \"get_weather\"\n                description = \"Get weather in the next days at given location.\"\n                inputs = {\n                    \"location\": {\"type\": \"string\", \"description\": \"the location\"},\n                    \"celsius\": {\n                        \"type\": \"string\",\n                        \"description\": \"the temperature type\",\n                    },\n                }\n                output_type = \"string\"\n\n                def forward(self, location: str, celsius: bool = False) -> str:\n                    return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n            GetWeatherTool2()\n        assert \"Nullable\" in str(e)\n\n        with pytest.raises(Exception) as e:\n\n            class GetWeatherTool3(Tool):\n                name = \"get_weather\"\n                description = \"Get weather in the next days at given location.\"\n                inputs = {\n                    \"location\": {\"type\": \"string\", \"description\": \"the location\"},\n        
            \"celsius\": {\n                        \"type\": \"string\",\n                        \"description\": \"the temperature type\",\n                        \"nullable\": True,\n                    },\n                }\n                output_type = \"string\"\n\n                def forward(self, location, celsius: str) -> str:\n                    return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n            GetWeatherTool3()\n        assert \"Nullable\" in str(e)\n\n    def test_tool_default_parameters_is_nullable(self):\n        @tool\n        def get_weather(location: str, celsius: bool = False) -> str:\n            \"\"\"\n            Get weather in the next days at given location.\n\n            Args:\n                location: The location to get the weather for.\n                celsius: is the temperature given in celsius?\n            \"\"\"\n            return \"The weather is UNGODLY with torrential rains and temperatures below -10°C\"\n\n        assert get_weather.inputs[\"celsius\"][\"nullable\"]\n\n    def test_tool_supports_any_none(self, tmp_path):\n        @tool\n        def get_weather(location: Any) -> None:\n            \"\"\"\n            Get weather in the next days at given location.\n\n            Args:\n                location: The location to get the weather for.\n            \"\"\"\n            return\n\n        get_weather.save(tmp_path)\n        assert get_weather.inputs[\"location\"][\"type\"] == \"any\"\n        assert get_weather.output_type == \"null\"\n\n    def test_tool_supports_array(self):\n        @tool\n        def get_weather(locations: list[str], months: tuple[str, str] | None = None) -> dict[str, float]:\n            \"\"\"\n            Get weather in the next days at given locations.\n\n            Args:\n                locations: The locations to get the weather for.\n                months: The months to get the weather for\n            \"\"\"\n            return\n\n   
     assert get_weather.inputs[\"locations\"][\"type\"] == \"array\"\n        assert get_weather.inputs[\"months\"][\"type\"] == \"array\"\n\n    def test_tool_supports_string_literal(self):\n        @tool\n        def get_weather(unit: Literal[\"celsius\", \"fahrenheit\"] = \"celsius\") -> None:\n            \"\"\"\n            Get weather in the next days at given location.\n\n            Args:\n                unit: The unit of temperature\n            \"\"\"\n            return\n\n        assert get_weather.inputs[\"unit\"][\"type\"] == \"string\"\n        assert get_weather.inputs[\"unit\"][\"enum\"] == [\"celsius\", \"fahrenheit\"]\n\n    def test_tool_supports_numeric_literal(self):\n        @tool\n        def get_choice(choice: Literal[1, 2, 3]) -> None:\n            \"\"\"\n            Get choice based on the provided numeric literal.\n\n            Args:\n                choice: The numeric choice to be made.\n            \"\"\"\n            return\n\n        assert get_choice.inputs[\"choice\"][\"type\"] == \"integer\"\n        assert get_choice.inputs[\"choice\"][\"enum\"] == [1, 2, 3]\n\n    def test_tool_supports_nullable_literal(self):\n        @tool\n        def get_choice(choice: Literal[1, 2, 3, None]) -> None:\n            \"\"\"\n            Get choice based on the provided value.\n\n            Args:\n                choice: The numeric choice to be made.\n            \"\"\"\n            return\n\n        assert get_choice.inputs[\"choice\"][\"type\"] == \"integer\"\n        assert get_choice.inputs[\"choice\"][\"nullable\"] is True\n        assert get_choice.inputs[\"choice\"][\"enum\"] == [1, 2, 3]\n\n    def test_saving_tool_produces_valid_pyhon_code_with_multiline_description(self, tmp_path):\n        @tool\n        def get_weather(location: Any) -> None:\n            \"\"\"\n            Get weather in the next days at given location.\n            And works pretty well.\n\n            Args:\n                location: The location to get the 
weather for.\n            \"\"\"\n            return\n\n        get_weather.save(tmp_path)\n        with open(os.path.join(tmp_path, \"tool.py\"), \"r\", encoding=\"utf-8\") as f:\n            source_code = f.read()\n            compile(source_code, f.name, \"exec\")\n\n    @pytest.mark.parametrize(\"fixture_name\", [\"boolean_default_tool_class\", \"boolean_default_tool_function\"])\n    def test_to_dict_boolean_default_input(self, fixture_name, request):\n        \"\"\"Test that boolean input parameter with default value is correctly represented in to_dict output\"\"\"\n        tool = request.getfixturevalue(fixture_name)\n        result = tool.to_dict()\n        # Check that the boolean default annotation is preserved\n        assert \"flag: bool = False\" in result[\"code\"]\n        # Check nullable attribute is set for the parameter with default value\n        assert \"'nullable': True\" in result[\"code\"]\n\n    @pytest.mark.parametrize(\"fixture_name\", [\"optional_input_tool_class\", \"optional_input_tool_function\"])\n    def test_to_dict_optional_input(self, fixture_name, request):\n        \"\"\"Test that Optional/nullable input parameter is correctly represented in to_dict output\"\"\"\n        tool = request.getfixturevalue(fixture_name)\n        result = tool.to_dict()\n        # Check the Optional type annotation is preserved\n        assert \"optional_text: str | None = None\" in result[\"code\"]\n        # Check that the input is marked as nullable in the code\n        assert \"'nullable': True\" in result[\"code\"]\n\n    def test_from_dict_roundtrip(self, example_tool):\n        # Convert to dict\n        tool_dict = example_tool.to_dict()\n        # Create from dict\n        recreated_tool = Tool.from_dict(tool_dict)\n        # Verify properties\n        assert recreated_tool.name == example_tool.name\n        assert recreated_tool.description == example_tool.description\n        assert recreated_tool.inputs == example_tool.inputs\n        
assert recreated_tool.output_type == example_tool.output_type\n        # Verify functionality\n        test_input = \"Hello, world!\"\n        assert recreated_tool(test_input) == test_input.upper()\n\n    def test_tool_from_dict_invalid(self):\n        # Missing code key\n        with pytest.raises(ValueError) as e:\n            Tool.from_dict({\"name\": \"invalid_tool\"})\n        assert \"must contain 'code' key\" in str(e)\n\n    def test_tool_decorator_preserves_original_function(self):\n        # Define a test function with type hints and docstring\n        def test_function(items: list[str]) -> str:\n            \"\"\"Join a list of strings.\n            Args:\n                items: A list of strings to join\n            Returns:\n                The joined string\n            \"\"\"\n            return \", \".join(items)\n\n        # Store original function signature, name, and source\n        original_signature = inspect.signature(test_function)\n        original_name = test_function.__name__\n        original_docstring = test_function.__doc__\n\n        # Create a tool from the function\n        test_tool = tool(test_function)\n\n        # Check that the original function is unchanged\n        assert original_signature == inspect.signature(test_function)\n        assert original_name == test_function.__name__\n        assert original_docstring == test_function.__doc__\n\n        # Verify that the tool's forward method has a different signature (it has 'self')\n        tool_forward_sig = inspect.signature(test_tool.forward)\n        assert list(tool_forward_sig.parameters.keys())[0] == \"self\"\n\n        # Original function should not have 'self' parameter\n        assert \"self\" not in original_signature.parameters\n\n    def test_tool_with_union_type_return(self):\n        @tool\n        def union_type_return_tool_function(param: int) -> str | bool:\n            \"\"\"\n            Tool with output union type.\n\n            Args:\n                
param: Input parameter.\n            \"\"\"\n            return str(param) if param > 0 else False\n\n        assert isinstance(union_type_return_tool_function, Tool)\n        assert union_type_return_tool_function.output_type == \"any\"\n\n\nclass TestToolDecorator:\n    def test_tool_decorator_source_extraction_with_multiple_decorators(self):\n        \"\"\"Test that @tool correctly extracts source code with multiple decorators.\"\"\"\n\n        def dummy_decorator(func):\n            return func\n\n        with pytest.warns(UserWarning, match=\"has decorators other than @tool\"):\n\n            @tool\n            @dummy_decorator\n            def multi_decorator_tool(text: str) -> str:\n                \"\"\"Tool with multiple decorators.\n\n                Args:\n                    text: Input text\n                \"\"\"\n                return text.upper()\n\n        # Verify the tool works\n        assert isinstance(multi_decorator_tool, Tool)\n        assert multi_decorator_tool.name == \"multi_decorator_tool\"\n        assert multi_decorator_tool(\"hello\") == \"HELLO\"\n\n        # Verify the source code extraction is correct\n        forward_source = multi_decorator_tool.forward.__source__\n        assert \"def forward(self, text: str) -> str:\" in forward_source\n        assert \"return text.upper()\" in forward_source\n        # Should not contain decorator lines\n        assert \"@tool\" not in forward_source\n        assert \"@dummy_decorator\" not in forward_source\n        # Should not contain definition line\n        assert \"def multi_decorator_tool\" not in forward_source\n\n    def test_tool_decorator_source_extraction_with_multiline_signature(self):\n        \"\"\"Test that @tool correctly extracts source code with multiline function signatures.\"\"\"\n\n        with warnings.catch_warnings():\n            warnings.simplefilter(\"error\")\n\n            @tool\n            def multiline_signature_tool(\n                text: str,\n             
   count: int = 1,\n                uppercase: bool = False,\n                multiline_parameter_1: int = 1_000,\n                multiline_parameter_2: int = 2_000,\n            ) -> str:\n                \"\"\"Tool with multiline signature.\n\n                Args:\n                    text: Input text\n                    count: Number of repetitions\n                    uppercase: Whether to convert to uppercase\n                    multiline_parameter_1: Dummy parameter\n                    multiline_parameter_2: Dummy parameter\n                \"\"\"\n                result = text * count\n                return result.upper() if uppercase else result\n\n        # Verify the tool works\n        assert isinstance(multiline_signature_tool, Tool)\n        assert multiline_signature_tool.name == \"multiline_signature_tool\"\n        assert multiline_signature_tool(\"hello\", 2, True) == \"HELLOHELLO\"\n\n        # Verify the source code extraction is correct\n        forward_source = multiline_signature_tool.forward.__source__\n        assert (\n            \"def forward(self, text: str, count: int=1, uppercase: bool=False, multiline_parameter_1: int=1000, multiline_parameter_2: int=2000) -> str:\"\n            in forward_source\n            or \"def forward(self, text: str, count: int = 1, uppercase: bool = False, multiline_parameter_1: int = 1000, multiline_parameter_2: int = 2000) -> str:\"\n            in forward_source\n        )\n        assert \"result = text * count\" in forward_source\n        assert \"return result.upper() if uppercase else result\" in forward_source\n        # Should not contain the original multiline function definition\n        assert \"def multiline_signature_tool(\" not in forward_source\n        # Should not contain leftover lines from the original multiline function definition\n        assert \"            count: int = 1,\" not in forward_source\n        assert \"            count: int=1,\" not in forward_source\n\n    def 
test_tool_decorator_source_extraction_with_multiple_decorators_and_multiline(self):\n        \"\"\"Test that @tool works with both multiple decorators and multiline signatures.\"\"\"\n\n        def dummy_decorator_1(func):\n            return func\n\n        def dummy_decorator_2(func):\n            return func\n\n        with pytest.warns(UserWarning, match=\"has decorators other than @tool\"):\n\n            @tool\n            @dummy_decorator_1\n            @dummy_decorator_2\n            def complex_tool(\n                text: str,\n                multiplier: int = 2,\n                separator: str = \" \",\n                multiline_parameter_1: int = 1_000,\n                multiline_parameter_2: int = 2_000,\n            ) -> str:\n                \"\"\"Complex tool with multiple decorators and multiline signature.\n\n                Args:\n                    text: Input text\n                    multiplier: How many times to repeat\n                    separator: What to use between repetitions\n                    multiline_parameter_1: Dummy parameter\n                    multiline_parameter_2: Dummy parameter\n                \"\"\"\n                parts = [text] * multiplier\n                return separator.join(parts)\n\n        # Verify the tool works\n        assert isinstance(complex_tool, Tool)\n        assert complex_tool.name == \"complex_tool\"\n        assert complex_tool(\"hello\", 3, \"-\") == \"hello-hello-hello\"\n\n        # Verify the source code extraction is correct\n        forward_source = complex_tool.forward.__source__\n        assert (\n            \"def forward(self, text: str, multiplier: int=2, separator: str=' ', multiline_parameter_1: int=1000, multiline_parameter_2: int=2000) -> str:\"\n            in forward_source\n            or \"def forward(self, text: str, multiplier: int = 2, separator: str = ' ', multiline_parameter_1: int = 1000, multiline_parameter_2: int = 2000) -> str:\"\n            in forward_source\n      
  )\n        assert \"parts = [text] * multiplier\" in forward_source\n        assert \"return separator.join(parts)\" in forward_source\n        # Should not contain any decorator lines\n        assert \"@tool\" not in forward_source\n        assert \"@dummy_decorator_1\" not in forward_source\n        assert \"@dummy_decorator_2\" not in forward_source\n        # Should not contain leftover lines from the original multiline function definition\n        assert \"            multiplier: int = 2,\" not in forward_source\n        assert \"            multiplier: int=2,\" not in forward_source\n\n\n@pytest.fixture\ndef mock_server_parameters():\n    return MagicMock()\n\n\n@pytest.fixture\ndef mock_mcp_adapt():\n    with patch(\"mcpadapt.core.MCPAdapt\") as mock:\n        mock.return_value.__enter__.return_value = [\"tool1\", \"tool2\"]\n        mock.return_value.__exit__.return_value = None\n        yield mock\n\n\n@pytest.fixture\ndef mock_smolagents_adapter():\n    with patch(\"mcpadapt.smolagents_adapter.SmolAgentsAdapter\") as mock:\n        yield mock\n\n\n# Ignore FutureWarning about structured_output default value change: this test intentionally uses default behavior\n@pytest.mark.filterwarnings(\"ignore:.*structured_output:FutureWarning\")\nclass TestToolCollection:\n    def test_from_mcp(self, mock_server_parameters, mock_mcp_adapt, mock_smolagents_adapter):\n        with ToolCollection.from_mcp(mock_server_parameters, trust_remote_code=True) as tool_collection:\n            assert isinstance(tool_collection, ToolCollection)\n            assert len(tool_collection.tools) == 2\n            assert \"tool1\" in tool_collection.tools\n            assert \"tool2\" in tool_collection.tools\n\n    @require_run_all\n    def test_integration_from_mcp(self):\n        # define the most simple mcp server with one tool that echoes the input text\n        mcp_server_script = dedent(\"\"\"\\\n            from mcp.server.fastmcp import FastMCP\n\n            mcp = 
FastMCP(\"Echo Server\")\n\n            @mcp.tool()\n            def echo_tool(text: str) -> str:\n                return text\n\n            mcp.run()\n        \"\"\").strip()\n\n        mcp_server_params = mcp.StdioServerParameters(\n            command=\"python\",\n            args=[\"-c\", mcp_server_script],\n        )\n\n        with ToolCollection.from_mcp(mcp_server_params, trust_remote_code=True) as tool_collection:\n            assert len(tool_collection.tools) == 1, \"Expected 1 tool\"\n            assert tool_collection.tools[0].name == \"echo_tool\", \"Expected tool name to be 'echo_tool'\"\n            assert tool_collection.tools[0](text=\"Hello\") == \"Hello\", \"Expected tool to echo the input text\"\n\n    def test_integration_from_mcp_with_streamable_http(self):\n        import subprocess\n        import time\n\n        # define the most simple mcp server with one tool that echoes the input text\n        mcp_server_script = dedent(\"\"\"\\\n            from mcp.server.fastmcp import FastMCP\n\n            mcp = FastMCP(\"Echo Server\", host=\"127.0.0.1\", port=8000)\n\n            @mcp.tool()\n            def echo_tool(text: str) -> str:\n                return text\n\n            mcp.run(transport=\"streamable-http\")\n        \"\"\").strip()\n\n        # start the streamable HTTP mcp server in a subprocess\n        server_process = subprocess.Popen(\n            [\"python\", \"-c\", mcp_server_script],\n        )\n\n        # wait for the server to start\n        time.sleep(1)\n\n        try:\n            with ToolCollection.from_mcp(\n                {\"url\": \"http://127.0.0.1:8000/mcp\", \"transport\": \"streamable-http\"}, trust_remote_code=True\n            ) as tool_collection:\n                assert len(tool_collection.tools) == 1, \"Expected 1 tool\"\n                assert tool_collection.tools[0].name == \"echo_tool\", \"Expected tool name to be 'echo_tool'\"\n                assert tool_collection.tools[0](text=\"Hello\") == \"Hello\", 
\"Expected tool to echo the input text\"\n        finally:\n            # clean up the process when test is done\n            server_process.kill()\n            server_process.wait()\n\n    def test_integration_from_mcp_with_sse(self):\n        import subprocess\n        import time\n\n        # define the most simple mcp server with one tool that echoes the input text\n        mcp_server_script = dedent(\"\"\"\\\n            from mcp.server.fastmcp import FastMCP\n\n            mcp = FastMCP(\"Echo Server\", host=\"127.0.0.1\", port=8000)\n\n            @mcp.tool()\n            def echo_tool(text: str) -> str:\n                return text\n\n            mcp.run(\"sse\")\n        \"\"\").strip()\n\n        # start the SSE mcp server in a subprocess\n        server_process = subprocess.Popen(\n            [\"python\", \"-c\", mcp_server_script],\n        )\n\n        # wait for the server to start\n        time.sleep(1)\n\n        try:\n            with ToolCollection.from_mcp(\n                {\"url\": \"http://127.0.0.1:8000/sse\", \"transport\": \"sse\"}, trust_remote_code=True\n            ) as tool_collection:\n                assert len(tool_collection.tools) == 1, \"Expected 1 tool\"\n                assert tool_collection.tools[0].name == \"echo_tool\", \"Expected tool name to be 'echo_tool'\"\n                assert tool_collection.tools[0](text=\"Hello\") == \"Hello\", \"Expected tool to echo the input text\"\n        finally:\n            # clean up the process when test is done\n            server_process.kill()\n            server_process.wait()\n\n\n@pytest.mark.parametrize(\"tool_fixture_name\", [\"boolean_default_tool_class\"])\ndef test_launch_gradio_demo_does_not_raise(tool_fixture_name, request):\n    tool = request.getfixturevalue(tool_fixture_name)\n    with patch(\"gradio.Interface.launch\") as mock_launch:\n        launch_gradio_demo(tool)\n    assert mock_launch.call_count == 1\n\n\n@pytest.mark.parametrize(\n    \"tool_input_type, 
expected_input, expects_error\",\n    [\n        (bool, True, False),\n        (str, \"b\", False),\n        (int, 1, False),\n        (float, 1, False),\n        (list, [\"a\", \"b\"], False),\n        (list[str], [\"a\", \"b\"], False),\n        (dict[str, str], {\"a\": \"b\"}, False),\n        (dict[str, str], \"b\", True),\n        (bool, \"b\", True),\n        (str | int, \"a\", False),\n        (str | int, 1, False),\n        (str | int, None, True),\n        (str | int, True, True),\n    ],\n)\ndef test_validate_tool_arguments(tool_input_type, expected_input, expects_error):\n    @tool\n    def test_tool(argument_a: tool_input_type) -> str:\n        \"\"\"Fake tool\n\n        Args:\n            argument_a: The input\n        \"\"\"\n        return argument_a\n\n    if expects_error:\n        with pytest.raises((ValueError, TypeError)):\n            validate_tool_arguments(test_tool, {\"argument_a\": expected_input})\n\n    else:\n        # Should not raise any exception\n        validate_tool_arguments(test_tool, {\"argument_a\": expected_input})\n\n\n@pytest.mark.parametrize(\n    \"scenario, type_hint, default, input_value, expected_error_message\",\n    [\n        # Required parameters (no default)\n        # - Valid input\n        (\"required_unsupported_none\", str, ..., \"text\", None),\n        # - None not allowed\n        (\"required_unsupported_none\", str, ..., None, \"Argument param has type 'null' but should be 'string'\"),\n        # - Missing required parameter is not allowed\n        (\"required_unsupported_none\", str, ..., ..., \"Argument param is required\"),\n        #\n        # Required parameters but supports None\n        # - Valid input\n        (\"required_supported_none\", str | None, ..., \"text\", None),\n        # - None allowed\n        (\"required_supported_none\", str | None, ..., None, None),\n        # - Missing required parameter is not allowed\n        # TODO: Fix this test case: property is marked as nullable because it 
can be None, but it can't be missing because it is required\n        # (\"required_supported_none\", str | None, ..., ..., \"Argument param is required\"),\n        pytest.param(\n            \"required_supported_none\",\n            str | None,\n            ...,\n            ...,\n            \"Argument param is required\",\n            marks=pytest.mark.skip(reason=\"TODO: Fix this test case\"),\n        ),\n        #\n        # Optional parameters (has default, doesn't support None)\n        # - Valid input\n        (\"optional_unsupported_none\", str, \"default\", \"text\", None),\n        # - None not allowed\n        # TODO: Fix this test case: property is marked as nullable because it has a default value, but it can't be None\n        # (\"optional_unsupported_none\", str, \"default\", None, \"Argument param has type 'null' but should be 'string'\"),\n        pytest.param(\n            \"optional_unsupported_none\",\n            str,\n            \"default\",\n            None,\n            \"Argument param has type 'null' but should be 'string'\",\n            marks=pytest.mark.skip(reason=\"TODO: Fix this test case\"),\n        ),\n        # - Missing optional parameter is allowed\n        (\"optional_unsupported_none\", str, \"default\", ..., None),\n        #\n        # Optional and supports None parameters with string default\n        # - Valid input\n        (\"optional_supported_none_str_default\", str | None, \"default\", \"text\", None),\n        # - None allowed\n        (\"optional_supported_none_str_default\", str | None, \"default\", None, None),\n        # - Missing optional parameter is allowed\n        (\"optional_supported_none_str_default\", str | None, \"default\", ..., None),\n        #\n        # Optional and supports None parameters with None default\n        # - Valid input\n        (\"optional_supported_none_none_default\", str | None, None, \"text\", None),\n        # - None allowed\n        (\"optional_supported_none_none_default\", 
str | None, None, None, None),\n        # - Missing optional parameter is allowed\n        (\"optional_supported_none_none_default\", str | None, None, ..., None),\n    ],\n)\ndef test_validate_tool_arguments_nullable(scenario, type_hint, default, input_value, expected_error_message):\n    \"\"\"Test validation of tool arguments with focus on nullable properties: optional (with default value) and supporting None value.\"\"\"\n\n    # Create a tool with the appropriate signature\n    if default is ...:  # Using Ellipsis to indicate no default value\n\n        @tool\n        def test_tool(param: type_hint) -> str:\n            \"\"\"Test tool.\n\n            Args:\n                param: Input param\n            \"\"\"\n            return str(param) if param is not None else \"NULL\"\n    else:\n\n        @tool\n        def test_tool(param: type_hint = default) -> str:\n            \"\"\"Test tool.\n\n            Args:\n                param: Input param.\n            \"\"\"\n            return str(param) if param is not None else \"NULL\"\n\n    # Test with the input dictionary\n    input_dict = {\"param\": input_value} if input_value is not ... else {}\n\n    if expected_error_message:\n        with pytest.raises((ValueError, TypeError), match=expected_error_message):\n            validate_tool_arguments(test_tool, input_dict)\n    else:\n        # Should not raise any exception\n        validate_tool_arguments(test_tool, input_dict)\n"
  },
  {
    "path": "tests/test_types.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport os\nimport tempfile\nimport unittest\nimport uuid\n\nimport PIL.Image\n\nfrom smolagents.agent_types import AgentAudio, AgentImage, AgentText\n\nfrom .utils.markers import require_soundfile, require_torch\n\n\ndef get_new_path(suffix=\"\") -> str:\n    directory = tempfile.mkdtemp()\n    return os.path.join(directory, str(uuid.uuid4()) + suffix)\n\n\n@require_soundfile\n@require_torch\nclass AgentAudioTests(unittest.TestCase):\n    def test_from_tensor(self):\n        import soundfile as sf\n        import torch\n\n        tensor = torch.rand(12, dtype=torch.float64) - 0.5\n        agent_type = AgentAudio(tensor)\n        path = str(agent_type.to_string())\n\n        # Ensure that the tensor and the agent_type's tensor are the same\n        self.assertTrue(torch.allclose(tensor, agent_type.to_raw(), atol=1e-4))\n\n        del agent_type\n\n        # Ensure the path remains even after the object deletion\n        self.assertTrue(os.path.exists(path))\n\n        # Ensure that the file contains the same value as the original tensor\n        new_tensor, _ = sf.read(path)\n        self.assertTrue(torch.allclose(tensor, torch.tensor(new_tensor), atol=1e-4))\n\n    def test_from_string(self):\n        import soundfile as sf\n        import torch\n\n        tensor = torch.rand(12, dtype=torch.float64) - 0.5\n        path = 
get_new_path(suffix=\".wav\")\n        sf.write(path, tensor, 16000)\n\n        agent_type = AgentAudio(path)\n\n        self.assertTrue(torch.allclose(tensor, agent_type.to_raw(), atol=1e-4))\n        self.assertEqual(agent_type.to_string(), path)\n\n\n@require_torch\nclass TestAgentImage:\n    def test_from_tensor(self):\n        import torch\n\n        tensor = torch.randint(0, 256, (64, 64, 3))\n        agent_type = AgentImage(tensor)\n        path = str(agent_type.to_string())\n\n        # Ensure that the tensor and the agent_type's tensor are the same\n        assert torch.allclose(tensor, agent_type._tensor, atol=1e-4)\n\n        assert isinstance(agent_type.to_raw(), PIL.Image.Image)\n\n        # Ensure the path remains even after the object deletion\n        del agent_type\n        assert os.path.exists(path)\n\n    def test_from_string(self, shared_datadir):\n        path = shared_datadir / \"000000039769.png\"\n        image = PIL.Image.open(path)\n        agent_type = AgentImage(path)\n\n        assert path.samefile(agent_type.to_string())\n        assert image == agent_type.to_raw()\n\n        # Ensure the path remains even after the object deletion\n        del agent_type\n        assert os.path.exists(path)\n\n    def test_from_image(self, shared_datadir):\n        path = shared_datadir / \"000000039769.png\"\n        image = PIL.Image.open(path)\n        agent_type = AgentImage(image)\n\n        assert not path.samefile(agent_type.to_string())\n        assert image == agent_type.to_raw()\n\n        # Ensure the path remains even after the object deletion\n        del agent_type\n        assert os.path.exists(path)\n\n\nclass AgentTextTests(unittest.TestCase):\n    def test_from_string(self):\n        string = \"Hey!\"\n        agent_type = AgentText(string)\n\n        self.assertEqual(string, agent_type.to_string())\n        self.assertEqual(string, agent_type.to_raw())\n"
  },
  {
    "path": "tests/test_utils.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nimport inspect\nimport os\nimport textwrap\nimport unittest\n\nimport pytest\nfrom IPython.core.interactiveshell import InteractiveShell\n\nfrom smolagents import Tool\nfrom smolagents.tools import tool\nfrom smolagents.utils import (\n    create_agent_gradio_app_template,\n    get_source,\n    instance_to_source,\n    is_valid_name,\n    parse_code_blobs,\n    parse_json_blob,\n)\n\n\nclass ValidTool(Tool):\n    name = \"valid_tool\"\n    description = \"A valid tool\"\n    inputs = {\"input\": {\"type\": \"string\", \"description\": \"input\"}}\n    output_type = \"string\"\n    simple_attr = \"string\"\n    dict_attr = {\"key\": \"value\"}\n\n    def __init__(self, optional_param=\"default\"):\n        super().__init__()\n        self.param = optional_param\n\n    def forward(self, input: str) -> str:\n        return input.upper()\n\n\n@tool\ndef valid_tool_function(input: str) -> str:\n    \"\"\"A valid tool function.\n\n    Args:\n        input (str): Input string.\n    \"\"\"\n    return input.upper()\n\n\nVALID_TOOL_SOURCE = \"\"\"\\\nfrom smolagents.tools import Tool\n\nclass ValidTool(Tool):\n    name = \"valid_tool\"\n    description = \"A valid tool\"\n    inputs = {'input': {'type': 'string', 'description': 'input'}}\n    output_type = \"string\"\n    simple_attr = \"string\"\n    dict_attr = {'key': 'value'}\n\n    def __init__(self, 
optional_param=\"default\"):\n        super().__init__()\n        self.param = optional_param\n\n    def forward(self, input: str) -> str:\n        return input.upper()\n\"\"\"\n\nVALID_TOOL_FUNCTION_SOURCE = '''\\\nfrom smolagents.tools import Tool\n\nclass SimpleTool(Tool):\n    name = \"valid_tool_function\"\n    description = \"A valid tool function.\"\n    inputs = {'input': {'type': 'string', 'description': 'Input string.'}}\n    output_type = \"string\"\n\n    def __init__(self):\n        self.is_initialized = True\n\n    def forward(self, input: str) -> str:\n        \"\"\"A valid tool function.\n\n        Args:\n            input (str): Input string.\n        \"\"\"\n        return input.upper()\n'''\n\n\nclass AgentTextTests(unittest.TestCase):\n    def test_parse_code_blobs(self):\n        with pytest.raises(ValueError):\n            parse_code_blobs(\"Wrong blob!\", (\"<code>\", \"</code>\"))\n\n        # Parsing markdown with code blobs should work\n        output = parse_code_blobs(\n            \"\"\"\nHere is how to solve the problem:\n<code>\nimport numpy as np\n</code>\n\"\"\",\n            (\"<code>\", \"</code>\"),\n        )\n        assert output == \"import numpy as np\"\n\n        # Parsing pure python code blobs should work\n        code_blob = \"import numpy as np\"\n        output = parse_code_blobs(code_blob, (\"```python\", \"```\"))\n        assert output == code_blob\n\n        # Allow whitespace after header\n        output = parse_code_blobs(\"<code>    \\ncode_a\\n</code>\", (\"<code>\", \"</code>\"))\n        assert output == \"code_a\"\n\n        # Parsing markdown with code blobs should work\n        output = parse_code_blobs(\n            \"\"\"\nHere is how to solve the problem:\n```python\nimport numpy as np\n```\n\"\"\",\n            (\"<code>\", \"</code>\"),\n        )\n        assert output == \"import numpy as np\"\n\n    def test_multiple_code_blobs(self):\n        test_input = 
\"<code>\\nFoo\\n</code>\\n\\n<code>\\ncode_a\\n</code>\\n\\n<code>\\ncode_b\\n</code>\"\n        result = parse_code_blobs(test_input, (\"<code>\", \"</code>\"))\n        assert result == \"Foo\\n\\ncode_a\\n\\ncode_b\"\n\n\n@pytest.fixture(scope=\"function\")\ndef ipython_shell():\n    \"\"\"Reset IPython shell before and after each test.\"\"\"\n    shell = InteractiveShell.instance()\n    shell.reset()  # Clean before test\n    yield shell\n    shell.reset()  # Clean after test\n\n\n@pytest.mark.parametrize(\n    \"obj_name, code_blob\",\n    [\n        (\"test_func\", \"def test_func():\\n    return 42\"),\n        (\"TestClass\", \"class TestClass:\\n    ...\"),\n    ],\n)\ndef test_get_source_ipython(ipython_shell, obj_name, code_blob):\n    ipython_shell.run_cell(code_blob, store_history=True)\n    obj = ipython_shell.user_ns[obj_name]\n    assert get_source(obj) == code_blob\n\n\ndef test_get_source_standard_class():\n    class TestClass: ...\n\n    source = get_source(TestClass)\n    assert source == \"class TestClass: ...\"\n    assert source == textwrap.dedent(inspect.getsource(TestClass)).strip()\n\n\ndef test_get_source_standard_function():\n    def test_func(): ...\n\n    source = get_source(test_func)\n    assert source == \"def test_func(): ...\"\n    assert source == textwrap.dedent(inspect.getsource(test_func)).strip()\n\n\ndef test_get_source_ipython_errors_empty_cells(ipython_shell):\n    test_code = textwrap.dedent(\"\"\"class TestClass:\\n    ...\"\"\").strip()\n    ipython_shell.user_ns[\"In\"] = [\"\"]\n    ipython_shell.run_cell(test_code, store_history=True)\n    with pytest.raises(ValueError, match=\"No code cells found in IPython session\"):\n        get_source(ipython_shell.user_ns[\"TestClass\"])\n\n\ndef test_get_source_ipython_errors_definition_not_found(ipython_shell):\n    test_code = textwrap.dedent(\"\"\"class TestClass:\\n    ...\"\"\").strip()\n    ipython_shell.user_ns[\"In\"] = [\"\", \"print('No class definition here')\"]\n  
  ipython_shell.run_cell(test_code, store_history=True)\n    with pytest.raises(ValueError, match=\"Could not find source code for TestClass in IPython history\"):\n        get_source(ipython_shell.user_ns[\"TestClass\"])\n\n\ndef test_get_source_ipython_errors_type_error():\n    with pytest.raises(TypeError, match=\"Expected class or callable\"):\n        get_source(None)\n\n\n@pytest.mark.parametrize(\n    \"tool, expected_tool_source\", [(ValidTool(), VALID_TOOL_SOURCE), (valid_tool_function, VALID_TOOL_FUNCTION_SOURCE)]\n)\ndef test_instance_to_source(tool, expected_tool_source):\n    tool_source = instance_to_source(tool, base_cls=Tool)\n    assert tool_source == expected_tool_source\n\n\ndef test_e2e_class_tool_save(tmp_path):\n    class TestTool(Tool):\n        name = \"test_tool\"\n        description = \"Test tool description\"\n        inputs = {\n            \"task\": {\n                \"type\": \"string\",\n                \"description\": \"tool input\",\n            }\n        }\n        output_type = \"string\"\n\n        def forward(self, task: str):\n            import IPython  # noqa: F401\n\n            return task\n\n    test_tool = TestTool()\n    test_tool.save(tmp_path, make_gradio_app=True)\n    assert set(os.listdir(tmp_path)) == {\"requirements.txt\", \"app.py\", \"tool.py\"}\n    assert (tmp_path / \"tool.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from typing import Any, Optional\n        from smolagents.tools import Tool\n        import IPython\n\n        class TestTool(Tool):\n            name = \"test_tool\"\n            description = \"Test tool description\"\n            inputs = {'task': {'type': 'string', 'description': 'tool input'}}\n            output_type = \"string\"\n\n            def forward(self, task: str):\n                import IPython  # noqa: F401\n\n                return task\n\n            def __init__(self, *args, **kwargs):\n                self.is_initialized = False\n        \"\"\"\n    
)\n    requirements = set((tmp_path / \"requirements.txt\").read_text().split())\n    assert requirements == {\"IPython\", \"smolagents\"}\n    assert (tmp_path / \"app.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from smolagents import launch_gradio_demo\n        from tool import TestTool\n\n        tool = TestTool()\n        launch_gradio_demo(tool)\n        \"\"\"\n    )\n\n\ndef test_e2e_ipython_class_tool_save(tmp_path):\n    shell = InteractiveShell.instance()\n    code_blob = textwrap.dedent(\n        f\"\"\"\\\n        from smolagents.tools import Tool\n        class TestTool(Tool):\n            name = \"test_tool\"\n            description = \"Test tool description\"\n            inputs = {{\"task\": {{\"type\": \"string\",\n                    \"description\": \"tool input\",\n                }}\n            }}\n            output_type = \"string\"\n\n            def forward(self, task: str):\n                import IPython  # noqa: F401\n\n                return task\n        TestTool().save(\"{tmp_path}\", make_gradio_app=True)\n        \"\"\"\n    )\n    assert shell.run_cell(code_blob, store_history=True).success\n    assert set(os.listdir(tmp_path)) == {\"requirements.txt\", \"app.py\", \"tool.py\"}\n    assert (tmp_path / \"tool.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from typing import Any, Optional\n        from smolagents.tools import Tool\n        import IPython\n\n        class TestTool(Tool):\n            name = \"test_tool\"\n            description = \"Test tool description\"\n            inputs = {'task': {'type': 'string', 'description': 'tool input'}}\n            output_type = \"string\"\n\n            def forward(self, task: str):\n                import IPython  # noqa: F401\n\n                return task\n\n            def __init__(self, *args, **kwargs):\n                self.is_initialized = False\n        \"\"\"\n    )\n    requirements = set((tmp_path / 
\"requirements.txt\").read_text().split())\n    assert requirements == {\"IPython\", \"smolagents\"}\n    assert (tmp_path / \"app.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from smolagents import launch_gradio_demo\n        from tool import TestTool\n\n        tool = TestTool()\n        launch_gradio_demo(tool)\n        \"\"\"\n    )\n\n\ndef test_e2e_function_tool_save(tmp_path):\n    @tool\n    def test_tool(task: str) -> str:\n        \"\"\"\n        Test tool description\n\n        Args:\n            task: tool input\n        \"\"\"\n        import IPython  # noqa: F401\n\n        return task\n\n    test_tool.save(tmp_path, make_gradio_app=True)\n    assert set(os.listdir(tmp_path)) == {\"requirements.txt\", \"app.py\", \"tool.py\"}\n    assert (tmp_path / \"tool.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from smolagents import Tool\n        from typing import Any, Optional\n\n        class SimpleTool(Tool):\n            name = \"test_tool\"\n            description = \"Test tool description\"\n            inputs = {'task': {'type': 'string', 'description': 'tool input'}}\n            output_type = \"string\"\n\n            def forward(self, task: str) -> str:\n                \\\"\"\"\n                Test tool description\n\n                Args:\n                    task: tool input\n                \\\"\"\"\n                import IPython  # noqa: F401\n\n                return task\"\"\"\n    )\n    requirements = set((tmp_path / \"requirements.txt\").read_text().split())\n    assert requirements == {\"smolagents\"}  # FIXME: IPython should be in the requirements\n    assert (tmp_path / \"app.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from smolagents import launch_gradio_demo\n        from tool import SimpleTool\n\n        tool = SimpleTool()\n        launch_gradio_demo(tool)\n        \"\"\"\n    )\n\n\ndef test_e2e_ipython_function_tool_save(tmp_path):\n    shell = 
InteractiveShell.instance()\n    code_blob = textwrap.dedent(\n        f\"\"\"\n        from smolagents import tool\n\n        @tool\n        def test_tool(task: str) -> str:\n            \\\"\"\"\n            Test tool description\n\n            Args:\n                task: tool input\n            \\\"\"\"\n            import IPython  # noqa: F401\n\n            return task\n\n        test_tool.save(\"{tmp_path}\", make_gradio_app=True)\n        \"\"\"\n    )\n    assert shell.run_cell(code_blob, store_history=True).success\n    assert set(os.listdir(tmp_path)) == {\"requirements.txt\", \"app.py\", \"tool.py\"}\n    assert (tmp_path / \"tool.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from smolagents import Tool\n        from typing import Any, Optional\n\n        class SimpleTool(Tool):\n            name = \"test_tool\"\n            description = \"Test tool description\"\n            inputs = {'task': {'type': 'string', 'description': 'tool input'}}\n            output_type = \"string\"\n\n            def forward(self, task: str) -> str:\n                \\\"\"\"\n                Test tool description\n\n                Args:\n                    task: tool input\n                \\\"\"\"\n                import IPython  # noqa: F401\n\n                return task\"\"\"\n    )\n    requirements = set((tmp_path / \"requirements.txt\").read_text().split())\n    assert requirements == {\"smolagents\"}  # FIXME: IPython should be in the requirements\n    assert (tmp_path / \"app.py\").read_text() == textwrap.dedent(\n        \"\"\"\\\n        from smolagents import launch_gradio_demo\n        from tool import SimpleTool\n\n        tool = SimpleTool()\n        launch_gradio_demo(tool)\n        \"\"\"\n    )\n\n\n@pytest.mark.parametrize(\n    \"raw_json, expected_data, expected_blob\",\n    [\n        (\n            \"\"\"{}\"\"\",\n            {},\n            \"\",\n        ),\n        (\n            \"\"\"Text{}\"\"\",\n            {},\n       
     \"Text\",\n        ),\n        (\n            \"\"\"{\"simple\": \"json\"}\"\"\",\n            {\"simple\": \"json\"},\n            \"\",\n        ),\n        (\n            \"\"\"With text here{\"simple\": \"json\"}\"\"\",\n            {\"simple\": \"json\"},\n            \"With text here\",\n        ),\n        (\n            \"\"\"{\"simple\": \"json\"}With text after\"\"\",\n            {\"simple\": \"json\"},\n            \"\",\n        ),\n        (\n            \"\"\"With text before{\"simple\": \"json\"}And text after\"\"\",\n            {\"simple\": \"json\"},\n            \"With text before\",\n        ),\n    ],\n)\ndef test_parse_json_blob_with_valid_json(raw_json, expected_data, expected_blob):\n    data, blob = parse_json_blob(raw_json)\n\n    assert data == expected_data\n    assert blob == expected_blob\n\n\n@pytest.mark.parametrize(\n    \"raw_json\",\n    [\n        \"\"\"simple\": \"json\"}\"\"\",\n        \"\"\"With text here\"simple\": \"json\"}\"\"\",\n        \"\"\"{\"simple\": \"\"json\"}With text after\"\"\",\n        \"\"\"{\"simple\": \"json\"With text after\"\"\",\n        \"}}\",\n    ],\n)\ndef test_parse_json_blob_with_invalid_json(raw_json):\n    with pytest.raises(Exception):\n        parse_json_blob(raw_json)\n\n\n@pytest.mark.parametrize(\n    \"name,expected\",\n    [\n        # Valid identifiers\n        (\"valid_name\", True),\n        (\"ValidName\", True),\n        (\"valid123\", True),\n        (\"_private\", True),\n        # Invalid identifiers\n        (\"\", False),\n        (\"123invalid\", False),\n        (\"invalid-name\", False),\n        (\"invalid name\", False),\n        (\"invalid.name\", False),\n        # Python keywords\n        (\"if\", False),\n        (\"for\", False),\n        (\"class\", False),\n        (\"return\", False),\n        # Non-string inputs\n        (123, False),\n        (None, False),\n        ([], False),\n        ({}, False),\n    ],\n)\ndef test_is_valid_name(name, expected):\n    
\"\"\"Test the is_valid_name function with various inputs.\"\"\"\n    assert is_valid_name(name) is expected\n\n\ndef test_agent_gradio_app_template_excludes_class_keyword():\n    \"\"\"Test that the AGENT_GRADIO_APP_TEMPLATE excludes 'class' from agent kwargs.\"\"\"\n\n    # Mock agent_dict with 'class' key that should be excluded\n    agent_dict = {\n        \"model\": {\"class\": \"CodeAgent\", \"data\": {}},\n        \"class\": \"CodeAgent\",  # This should be excluded to prevent SyntaxError\n        \"some_valid_attr\": \"value\",\n        \"tools\": [],\n        \"managed_agents\": {},\n        \"requirements\": [],\n        \"prompt_templates\": {},\n    }\n\n    template = create_agent_gradio_app_template()\n    result = template.render(\n        agent_name=\"test_agent\",\n        class_name=\"CodeAgent\",\n        agent_dict=agent_dict,\n        tools={},\n        managed_agents={},\n        managed_agent_relative_path=\"\",\n    )\n\n    # Should contain valid attribute but not 'class='  as a keyword argument\n    assert \"some_valid_attr='value',\" in result\n    assert \"class=\" not in result\n\n    # Verify the generated code is syntactically valid Python\n    import ast\n\n    try:\n        ast.parse(result)\n    except SyntaxError as e:\n        pytest.fail(f\"Generated app.py contains syntax error: {e}\")\n"
  },
  {
    "path": "tests/test_vision_web_browser.py",
    "content": "\"\"\"Test XPath injection vulnerability fix in vision_web_browser.py\"\"\"\n\nfrom unittest.mock import Mock, patch\n\nimport pytest\n\nfrom smolagents.vision_web_browser import _escape_xpath_string, search_item_ctrl_f\n\n\n@pytest.fixture\ndef mock_driver():\n    \"\"\"Mock Selenium WebDriver\"\"\"\n    driver = Mock()\n    driver.find_elements.return_value = [Mock()]  # Mock found elements\n    driver.execute_script.return_value = None\n    return driver\n\n\nclass TestXPathEscaping:\n    \"\"\"Test XPath string escaping functionality\"\"\"\n\n    @pytest.mark.parametrize(\n        \"input_text,expected_pattern\",\n        [\n            (\"normal text\", \"'normal text'\"),\n            (\"text with 'quote'\", \"\\\"text with 'quote'\\\"\"),\n            ('text with \"quote\"', \"'text with \\\"quote\\\"'\"),\n            (\"text with one single'quote\", '\"text with one single\\'quote\"'),\n            ('text with one double\"quote', \"'text with one double\\\"quote'\"),\n            (\n                \"text with both 'single' and \\\"double\\\" quotes\",\n                \"concat('text with both ', \\\"'\\\", 'single', \\\"'\\\", ' and \\\"double\\\" quotes')\",\n            ),\n            (\"\", \"''\"),\n            (\"'\", '\"\\'\"'),\n            ('\"', \"'\\\"'\"),\n        ],\n    )\n    def test_escape_xpath_string_basic(self, input_text, expected_pattern):\n        \"\"\"Test basic XPath escaping cases\"\"\"\n        result = _escape_xpath_string(input_text)\n        assert result == expected_pattern\n\n    @pytest.mark.parametrize(\n        \"input_text\",\n        [\n            \"text with both 'single' and \\\"double\\\" quotes\",\n            'it\\'s a \"test\" case',\n            \"'mixed\\\" quotes'\",\n        ],\n    )\n    def test_escape_xpath_string_mixed_quotes(self, input_text):\n        \"\"\"Test XPath escaping with mixed quotes uses concat()\"\"\"\n        result = _escape_xpath_string(input_text)\n        assert 
result.startswith(\"concat(\")\n        assert result.endswith(\")\")\n\n    @pytest.mark.parametrize(\n        \"malicious_input\",\n        [\n            \"')] | //script[@src='evil.js'] | foo[contains(text(), '\",\n            \"') or 1=1 or ('\",\n            \"')] | //user[contains(@role,'admin')] | foo[contains(text(), '\",\n            \"') and substring(//user[1]/password,1,1)='a\",\n        ],\n    )\n    def test_escape_prevents_injection(self, malicious_input):\n        \"\"\"Test that malicious XPath injection attempts are safely escaped\"\"\"\n        result = _escape_xpath_string(malicious_input)\n        # Should either be wrapped in quotes or use concat()\n        assert (\n            (result.startswith(\"'\") and result.endswith(\"'\"))\n            or (result.startswith('\"') and result.endswith('\"'))\n            or result.startswith(\"concat(\")\n        )\n\n\nclass TestSearchItemCtrlF:\n    \"\"\"Test the search_item_ctrl_f function with XPath injection protection\"\"\"\n\n    @pytest.mark.parametrize(\n        \"search_text\",\n        [\n            \"normal search\",\n            \"search with 'quotes'\",\n            'search with \"quotes\"',\n            \"')] | //script[@src='evil.js'] | foo[contains(text(), '\",\n            \"') or 1=1 or ('\",\n        ],\n    )\n    def test_search_item_prevents_injection(self, search_text, mock_driver):\n        \"\"\"Test that search_item_ctrl_f prevents XPath injection\"\"\"\n        with patch(\"smolagents.vision_web_browser.driver\", mock_driver, create=True):\n            # Call the function\n            result = search_item_ctrl_f(search_text)\n\n            # Verify driver.find_elements was called\n            mock_driver.find_elements.assert_called_once()\n\n            # Get the actual XPath query that was generated\n            call_args = mock_driver.find_elements.call_args\n            xpath_query = call_args[0][1]  # Second positional argument\n\n            # Verify the query 
doesn't contain unescaped injection\n            if \"')] | //\" in search_text:\n                # For injection attempts, verify they're properly escaped\n                # The query should either use concat() or be properly quoted\n                is_concat = \"concat(\" in xpath_query\n                is_properly_quoted = xpath_query.count('\"') >= 2 or xpath_query.count(\"'\") >= 2\n                assert is_concat or is_properly_quoted, f\"XPath injection not prevented: {xpath_query}\"\n\n            # Verify we got a result\n            assert \"Found\" in result\n\n    def test_search_item_nth_result(self, mock_driver):\n        \"\"\"Test nth_result parameter works correctly\"\"\"\n        mock_driver.find_elements.return_value = [Mock(), Mock(), Mock()]  # 3 elements\n\n        with patch(\"smolagents.vision_web_browser.driver\", mock_driver, create=True):\n            result = search_item_ctrl_f(\"test\", nth_result=2)\n\n            # Should find 3 matches and focus on element 2\n            assert \"Found 3 matches\" in result\n            assert \"Focused on element 2 of 3\" in result\n\n    def test_search_item_not_found(self, mock_driver):\n        \"\"\"Test exception when nth_result exceeds available matches\"\"\"\n        mock_driver.find_elements.return_value = [Mock()]  # Only 1 element\n\n        with patch(\"smolagents.vision_web_browser.driver\", mock_driver, create=True):\n            with pytest.raises(Exception, match=\"Match n°3 not found\"):\n                search_item_ctrl_f(\"test\", nth_result=3)\n"
  },
  {
    "path": "tests/utils/markers.py",
    "content": "# coding=utf-8\n# Copyright 2024 HuggingFace Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Markers for tests.\"\"\"\n\nimport os\nfrom importlib.util import find_spec\n\nimport pytest\n\n\nrequire_run_all = pytest.mark.skipif(not os.getenv(\"RUN_ALL\"), reason=\"requires RUN_ALL environment variable\")\nrequire_soundfile = pytest.mark.skipif(find_spec(\"soundfile\") is None, reason=\"requires soundfile\")\nrequire_torch = pytest.mark.skipif(find_spec(\"torch\") is None, reason=\"requires torch\")\n"
  }
]