Repository: ollama/ollama-python
Branch: main
Commit: dbccf192ac6b
Files: 52
Total size: 192.8 KB
Directory structure:
gitextract_5oebzo1m/
├── .github/
│   ├── dependabot.yml
│   └── workflows/
│       ├── publish.yaml
│       └── test.yaml
├── .gitignore
├── LICENSE
├── README.md
├── SECURITY.md
├── examples/
│   ├── README.md
│   ├── async-chat.py
│   ├── async-generate.py
│   ├── async-structured-outputs.py
│   ├── async-tools.py
│   ├── chat-logprobs.py
│   ├── chat-stream.py
│   ├── chat-with-history.py
│   ├── chat.py
│   ├── create.py
│   ├── embed.py
│   ├── fill-in-middle.py
│   ├── generate-image.py
│   ├── generate-logprobs.py
│   ├── generate-stream.py
│   ├── generate.py
│   ├── gpt-oss-tools-stream.py
│   ├── gpt-oss-tools.py
│   ├── list.py
│   ├── multi-tool.py
│   ├── multimodal-chat.py
│   ├── multimodal-generate.py
│   ├── ps.py
│   ├── pull.py
│   ├── show.py
│   ├── structured-outputs-image.py
│   ├── structured-outputs.py
│   ├── thinking-generate.py
│   ├── thinking-levels.py
│   ├── thinking.py
│   ├── tools.py
│   ├── web-search-gpt-oss.py
│   ├── web-search-mcp.py
│   ├── web-search.py
│   └── web_search_gpt_oss_helper.py
├── ollama/
│   ├── __init__.py
│   ├── _client.py
│   ├── _types.py
│   ├── _utils.py
│   └── py.typed
├── pyproject.toml
├── requirements.txt
└── tests/
    ├── test_client.py
    ├── test_type_serialization.py
    └── test_utils.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .github/dependabot.yml
================================================
version: 2
updates:
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: daily
  - package-ecosystem: pip
    directory: /
    schedule:
      interval: daily
================================================
FILE: .github/workflows/publish.yaml
================================================
name: publish
on:
  release:
    types:
      - created
jobs:
  publish:
    runs-on: ubuntu-latest
    environment: release
    permissions:
      id-token: write
      contents: write
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-python@v6
      - uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
      - run: uv build
      - uses: pypa/gh-action-pypi-publish@release/v1
      - run: gh release upload $GITHUB_REF_NAME dist/*
        env:
          GH_TOKEN: ${{ github.token }}
================================================
FILE: .github/workflows/test.yaml
================================================
name: test
on:
  push:
    branches:
      - main
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
      - run: uvx hatch test -acp
        if: ${{ always() }}
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-python@v6
      - uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
      - name: check formatting
        run: uvx hatch fmt --check -f
      - name: check linting
        run: uvx hatch fmt --check -l --output-format=github
      - name: check uv.lock is up-to-date
        run: uv lock --check
      - name: check requirements.txt is up-to-date
        run: |
          uv export >requirements.txt
          git diff --exit-code requirements.txt
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) Ollama
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# Ollama Python Library
The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with [Ollama](https://github.com/ollama/ollama).
## Prerequisites
- [Ollama](https://ollama.com/download) should be installed and running
- Pull a model to use with the library: `ollama pull <model>` e.g. `ollama pull gemma3`
- See [Ollama.com](https://ollama.com/search) for more information on the models available.
## Install
```sh
pip install ollama
```
## Usage
```python
from ollama import chat
from ollama import ChatResponse
response: ChatResponse = chat(model='gemma3', messages=[
{
'role': 'user',
'content': 'Why is the sky blue?',
},
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
```
See [_types.py](ollama/_types.py) for more information on the response types.
## Streaming responses
Response streaming can be enabled by setting `stream=True`.
```python
from ollama import chat
stream = chat(
model='gemma3',
messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
stream=True,
)
for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
```
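When the complete reply is also needed after streaming, the chunks can be accumulated as they arrive. The following is a minimal sketch, not part of the library: `collect_stream` is a hypothetical helper, and `fake_stream` is a stand-in for the iterator returned by `chat(..., stream=True)` so that the snippet runs without an Ollama server.

```python
from typing import Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping]) -> str:
  """Print each streamed piece as it arrives and return the full text."""
  parts = []
  for chunk in chunks:
    piece = chunk['message']['content']
    print(piece, end='', flush=True)
    parts.append(piece)
  print()
  return ''.join(parts)


# Stand-in (assumption) for chat(model='gemma3', messages=[...], stream=True)
fake_stream = [
  {'message': {'content': 'The sky '}},
  {'message': {'content': 'is blue.'}},
]
full_text = collect_stream(fake_stream)
```

With a running server, the same helper could be given the real stream object in place of `fake_stream`.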
## Cloud Models
Run larger models by offloading to Ollama’s cloud while keeping your local workflow.
- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`, `kimi-k2-thinking`. See [Ollama Models - Cloud](https://ollama.com/search?c=cloud) for more information.
### Run via local Ollama
1) Sign in (one-time):
```
ollama signin
```
2) Pull a cloud model:
```
ollama pull gpt-oss:120b-cloud
```
3) Make a request:
```python
from ollama import Client
client = Client()
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
]
for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
  print(part.message.content, end='', flush=True)
```
### Cloud API (ollama.com)
Access cloud models directly by pointing the client at `https://ollama.com`.
1) Create an API key from [ollama.com](https://ollama.com/settings/keys), then set:
```
export OLLAMA_API_KEY=your_api_key
```
2) (Optional) List models available via the API:
```
curl https://ollama.com/api/tags
```
3) Generate a response via the cloud API:
```python
import os
from ollama import Client
client = Client(
host='https://ollama.com',
headers={'Authorization': 'Bearer ' + os.environ.get('OLLAMA_API_KEY')}
)
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
]
for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
  print(part.message.content, end='', flush=True)
```
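If `OLLAMA_API_KEY` is unset, `os.environ.get` returns `None`, and concatenating it to `'Bearer '` raises a `TypeError` before any request is made. A small guard can turn that into a clearer error. This is a sketch under assumptions: `bearer_headers` is a hypothetical helper name, not a library function, and the demo passes an explicit mapping so it runs without a real key.

```python
import os
from typing import Dict, Mapping


def bearer_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
  """Build the Authorization header, failing early if the key is missing."""
  api_key = env.get('OLLAMA_API_KEY')
  if not api_key:
    raise RuntimeError('OLLAMA_API_KEY is not set')
  return {'Authorization': 'Bearer ' + api_key}


# Demonstrated with an explicit mapping (placeholder key) so the snippet runs anywhere.
headers = bearer_headers({'OLLAMA_API_KEY': 'example-key'})
print(headers)
```

The returned dictionary could then be passed as `headers=` when constructing `Client(host='https://ollama.com', ...)`.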
## Custom client
A custom client can be created by instantiating `Client` or `AsyncClient` from `ollama`.
All extra keyword arguments are passed into the [`httpx.Client`](https://www.python-httpx.org/api/#client).
```python
from ollama import Client
client = Client(
host='http://localhost:11434',
headers={'x-some-header': 'some-value'}
)
response = client.chat(model='gemma3', messages=[
{
'role': 'user',
'content': 'Why is the sky blue?',
},
])
```
## Async client
The `AsyncClient` class is used to make asynchronous requests. It can be configured with the same fields as the `Client` class.
```python
import asyncio
from ollama import AsyncClient
async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='gemma3', messages=[message])
asyncio.run(chat())
```
Setting `stream=True` modifies functions to return a Python asynchronous generator:
```python
import asyncio
from ollama import AsyncClient
async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='gemma3', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)
asyncio.run(chat())
```
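The consumption pattern for an asynchronous generator can be tried without a server. In this sketch, `fake_chat_stream` is a hypothetical stand-in for the generator that the awaited `AsyncClient().chat(..., stream=True)` call returns, and `collect` is an illustrative helper; neither is part of the library.

```python
import asyncio
from typing import AsyncIterator, Dict


async def fake_chat_stream() -> AsyncIterator[Dict]:
  """Stand-in (assumption) for the generator returned by a streamed chat call."""
  for piece in ['Rayleigh ', 'scattering.']:
    await asyncio.sleep(0)  # yield control to the event loop, as network I/O would
    yield {'message': {'content': piece}}


async def collect(stream: AsyncIterator[Dict]) -> str:
  """Consume an async stream and return the concatenated content."""
  parts = []
  async for part in stream:
    parts.append(part['message']['content'])
  return ''.join(parts)


result = asyncio.run(collect(fake_chat_stream()))
print(result)
```

With the real client, the chat call itself is awaited first to obtain the generator, which is then iterated with `async for`.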
## API
The Ollama Python library's API is designed around the [Ollama REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).
### Chat
```python
ollama.chat(model='gemma3', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
```
### Generate
```python
ollama.generate(model='gemma3', prompt='Why is the sky blue?')
```
### List
```python
ollama.list()
```
### Show
```python
ollama.show('gemma3')
```
### Create
```python
ollama.create(model='example', from_='gemma3', system="You are Mario from Super Mario Bros.")
```
### Copy
```python
ollama.copy('gemma3', 'user/gemma3')
```
### Delete
```python
ollama.delete('gemma3')
```
### Pull
```python
ollama.pull('gemma3')
```
### Push
```python
ollama.push('user/gemma3')
```
### Embed
```python
ollama.embed(model='gemma3', input='The sky is blue because of Rayleigh scattering')
```
### Embed (batch)
```python
ollama.embed(model='gemma3', input=['The sky is blue because of Rayleigh scattering', 'Grass is green because of chlorophyll'])
```
### Ps
```python
ollama.ps()
```
## Errors
Errors are raised if requests return an error status or if an error is detected while streaming.
```python
import ollama
model = 'does-not-yet-exist'
try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  # A 404 means the model is not available locally, so pull it
  if e.status_code == 404:
    ollama.pull(model)
```
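The pull-on-404 pattern can be extended to retry the request once the model has been pulled. The following is a sketch under stated assumptions: `FakeResponseError` stands in for `ollama.ResponseError` (mirroring only the `error` and `status_code` attributes used above), and `chat_with_pull`, `fake_chat`, and `pulled` are hypothetical names used so the example runs without a server.

```python
from typing import Callable


class FakeResponseError(Exception):
  """Stand-in (assumption) for ollama.ResponseError."""

  def __init__(self, error: str, status_code: int):
    super().__init__(error)
    self.error = error
    self.status_code = status_code


def chat_with_pull(chat_fn: Callable, pull_fn: Callable, model: str, error_type=FakeResponseError):
  """Call chat_fn; on a 404, pull the model once and retry."""
  try:
    return chat_fn(model)
  except error_type as e:
    if e.status_code != 404:
      raise  # other errors are not handled here
    pull_fn(model)
    return chat_fn(model)


# Simulated server state: the set of models that have been pulled.
pulled = set()


def fake_chat(model):
  if model not in pulled:
    raise FakeResponseError('model not found', 404)
  return f'reply from {model}'


reply = chat_with_pull(fake_chat, pulled.add, 'does-not-yet-exist')
print(reply)
```

Against a real server, `chat_fn` and `pull_fn` would wrap `ollama.chat` and `ollama.pull`, with `error_type=ollama.ResponseError`.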
================================================
FILE: SECURITY.md
================================================
# Security
The Ollama maintainer team takes security seriously and will actively work to resolve security issues.
## Reporting a vulnerability
If you discover a security vulnerability, please do not open a public issue. Instead, please report it by emailing hello@ollama.com. We ask that you give us sufficient time to investigate and address the vulnerability before disclosing it publicly.
Please include the following details in your report:
- A description of the vulnerability
- Steps to reproduce the issue
- Your assessment of the potential impact
- Any possible mitigations
## Security best practices
While the maintainer team does their best to secure Ollama, users are encouraged to implement their own security best practices, such as:
- Regularly updating to the latest version of Ollama
- Securing access to hosted instances of Ollama
- Monitoring systems for unusual activity
## Contact
For any other questions or concerns related to security, please contact us at hello@ollama.com.
================================================
FILE: examples/README.md
================================================
# Running Examples
Run the examples in this directory with:
```sh
# Run example
python3 examples/<example>.py
# or with uv
uv run examples/<example>.py
```
See [ollama/docs/api.md](https://github.com/ollama/ollama/blob/main/docs/api.md) for full API documentation.
### Chat - Chat with a model
- [chat.py](chat.py)
- [async-chat.py](async-chat.py)
- [chat-stream.py](chat-stream.py) - Streamed outputs
- [chat-with-history.py](chat-with-history.py) - Chat with model and maintain history of the conversation
### Generate - Generate text with a model
- [generate.py](generate.py)
- [async-generate.py](async-generate.py)
- [generate-stream.py](generate-stream.py) - Streamed outputs
- [fill-in-middle.py](fill-in-middle.py) - Given a prefix and suffix, fill in the middle
### Tools/Function Calling - Call a function with a model
- [tools.py](tools.py) - Simple example of Tools/Function Calling
- [async-tools.py](async-tools.py)
- [multi-tool.py](multi-tool.py) - Using multiple tools, with thinking enabled
#### gpt-oss
- [gpt-oss-tools.py](gpt-oss-tools.py)
- [gpt-oss-tools-stream.py](gpt-oss-tools-stream.py)
### Web search
An API key from Ollama's cloud service is required. You can create one [here](https://ollama.com/settings/keys).
```shell
export OLLAMA_API_KEY="your_api_key_here"
```
- [web-search.py](web-search.py)
- [web-search-gpt-oss.py](web-search-gpt-oss.py) - Using browser research tools with gpt-oss
#### MCP server
The MCP server can be used with an MCP client like Cursor, Cline, Codex, Open WebUI, Goose, and more.
```sh
uv run examples/web-search-mcp.py
```
Configuration to use with an MCP client:
```json
{
  "mcpServers": {
    "web_search": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "path/to/ollama-python/examples/web-search-mcp.py"],
      "env": { "OLLAMA_API_KEY": "your_api_key_here" }
    }
  }
}
```
- [web-search-mcp.py](web-search-mcp.py)
### Multimodal with Images - Chat with a multimodal (image chat) model
- [multimodal-chat.py](multimodal-chat.py)
- [multimodal-generate.py](multimodal-generate.py)
### Image Generation (Experimental) - Generate images with a model
> **Note:** Image generation is experimental and currently only available on macOS.
- [generate-image.py](generate-image.py)
### Structured Outputs - Generate structured outputs with a model
- [structured-outputs.py](structured-outputs.py)
- [async-structured-outputs.py](async-structured-outputs.py)
- [structured-outputs-image.py](structured-outputs-image.py)
### Ollama List - List all downloaded models and their properties
- [list.py](list.py)
### Ollama Show - Display model properties and capabilities
- [show.py](show.py)
### Ollama ps - Show model status with CPU/GPU usage
- [ps.py](ps.py)
### Ollama Pull - Pull a model from Ollama
Requirement: `pip install tqdm`
- [pull.py](pull.py)
### Ollama Create - Create a model from a Modelfile
- [create.py](create.py)
### Ollama Embed - Generate embeddings with a model
- [embed.py](embed.py)
### Thinking - Enable thinking mode for a model
- [thinking.py](thinking.py)
### Thinking (generate) - Enable thinking mode for a model
- [thinking-generate.py](thinking-generate.py)
### Thinking (levels) - Choose the thinking level
- [thinking-levels.py](thinking-levels.py)
================================================
FILE: examples/async-chat.py
================================================
import asyncio
from ollama import AsyncClient
async def main():
  messages = [
    {
      'role': 'user',
      'content': 'Why is the sky blue?',
    },
  ]
  client = AsyncClient()
  response = await client.chat('gemma3', messages=messages)
  print(response['message']['content'])
if __name__ == '__main__':
  asyncio.run(main())
================================================
FILE: examples/async-generate.py
================================================
import asyncio
import ollama
async def main():
  client = ollama.AsyncClient()
  response = await client.generate('gemma3', 'Why is the sky blue?')
  print(response['response'])
if __name__ == '__main__':
  try:
    asyncio.run(main())
  except KeyboardInterrupt:
    print('\nGoodbye!')
================================================
FILE: examples/async-structured-outputs.py
================================================
import asyncio
from pydantic import BaseModel
from ollama import AsyncClient
# Define the schema for the response
class FriendInfo(BaseModel):
  name: str
  age: int
  is_available: bool
class FriendList(BaseModel):
  friends: list[FriendInfo]
async def main():
  client = AsyncClient()
  response = await client.chat(
    model='llama3.1:8b',
    messages=[{'role': 'user', 'content': 'I have two friends. The first is Ollama 22 years old busy saving the world, and the second is Alonso 23 years old and wants to hang out. Return a list of friends in JSON format'}],
    format=FriendList.model_json_schema(),  # Use Pydantic to generate the schema
    options={'temperature': 0},  # Make responses more deterministic
  )
  # Use Pydantic to validate the response
  friends_response = FriendList.model_validate_json(response.message.content)
  print(friends_response)
if __name__ == '__main__':
  asyncio.run(main())
================================================
FILE: examples/async-tools.py
================================================
import asyncio
import ollama
from ollama import ChatResponse
def add_two_numbers(a: int, b: int) -> int:
  """
  Add two numbers
  Args:
    a (int): The first number
    b (int): The second number
  Returns:
    int: The sum of the two numbers
  """
  return a + b
def subtract_two_numbers(a: int, b: int) -> int:
  """
  Subtract two numbers
  """
  return a - b
# Tools can still be manually defined and passed into chat
subtract_two_numbers_tool = {
  'type': 'function',
  'function': {
    'name': 'subtract_two_numbers',
    'description': 'Subtract two numbers',
    'parameters': {
      'type': 'object',
      'required': ['a', 'b'],
      'properties': {
        'a': {'type': 'integer', 'description': 'The first number'},
        'b': {'type': 'integer', 'description': 'The second number'},
      },
    },
  },
}
messages = [{'role': 'user', 'content': 'What is three plus one?'}]
print('Prompt:', messages[0]['content'])
available_functions = {
  'add_two_numbers': add_two_numbers,
  'subtract_two_numbers': subtract_two_numbers,
}
async def main():
  client = ollama.AsyncClient()
  response: ChatResponse = await client.chat(
    'llama3.1',
    messages=messages,
    tools=[add_two_numbers, subtract_two_numbers_tool],
  )
  # Collect one tool message per successful call so that no result is lost
  # and `output` is never read when a function was not found
  tool_messages = []
  if response.message.tool_calls:
    # There may be multiple tool calls in the response
    for tool in response.message.tool_calls:
      # Ensure the function is available, and then call it
      if function_to_call := available_functions.get(tool.function.name):
        print('Calling function:', tool.function.name)
        print('Arguments:', tool.function.arguments)
        output = function_to_call(**tool.function.arguments)
        print('Function output:', output)
        tool_messages.append({'role': 'tool', 'content': str(output), 'tool_name': tool.function.name})
      else:
        print('Function', tool.function.name, 'not found')
  # Only needed to chat with the model using the tool call results
  if response.message.tool_calls:
    # Add the function responses to messages for the model to use
    messages.append(response.message)
    messages.extend(tool_messages)
    # Get final response from model with function outputs
    final_response = await client.chat('llama3.1', messages=messages)
    print('Final response:', final_response.message.content)
  else:
    print('No tool calls returned from model')
if __name__ == '__main__':
  try:
    asyncio.run(main())
  except KeyboardInterrupt:
    print('\nGoodbye!')
================================================
FILE: examples/chat-logprobs.py
================================================
from typing import Iterable
import ollama
def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
  print(f'\n{label}:')
  for entry in logprobs:
    token = entry.get('token', '')
    logprob = entry.get('logprob')
    print(f' token={token!r:<12} logprob={logprob:.3f}')
    for alt in entry.get('top_logprobs', []):
      if alt['token'] != token:
        print(f' alt -> {alt["token"]!r:<12} ({alt["logprob"]:.3f})')
messages = [
{
'role': 'user',
'content': 'hi! be concise.',
},
]
response = ollama.chat(
model='gemma3',
messages=messages,
logprobs=True,
top_logprobs=3,
)
print('Chat response:', response['message']['content'])
print_logprobs(response.get('logprobs', []), 'chat logprobs')
================================================
FILE: examples/chat-stream.py
================================================
from ollama import chat
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
]
for part in chat('gemma3', messages=messages, stream=True):
  print(part['message']['content'], end='', flush=True)
================================================
FILE: examples/chat-with-history.py
================================================
from ollama import chat
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
{
'role': 'assistant',
'content': "The sky is blue because of the way the Earth's atmosphere scatters sunlight.",
},
{
'role': 'user',
'content': 'What is the weather in Tokyo?',
},
{
'role': 'assistant',
'content': """The weather in Tokyo is typically warm and humid during the summer months, with temperatures often exceeding 30°C (86°F). The city experiences a rainy season from June to September, with heavy rainfall and occasional typhoons. Winter is mild, with temperatures
rarely dropping below freezing. The city is known for its high-tech and vibrant culture, with many popular tourist attractions such as the Tokyo Tower, Senso-ji Temple, and the bustling Shibuya district.""",
},
]
while True:
  user_input = input('Chat with history: ')
  response = chat(
    'gemma3',
    messages=[*messages, {'role': 'user', 'content': user_input}],
  )
  # Add the response to the messages to maintain the history
  messages += [
    {'role': 'user', 'content': user_input},
    {'role': 'assistant', 'content': response.message.content},
  ]
  print(response.message.content + '\n')
================================================
FILE: examples/chat.py
================================================
from ollama import chat
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
]
response = chat('gemma3', messages=messages)
print(response['message']['content'])
================================================
FILE: examples/create.py
================================================
from ollama import Client
client = Client()
response = client.create(
  model='my-assistant',
  from_='gemma3',
  system='You are Mario from Super Mario Bros.',
  stream=False,
)
print(response.status)
================================================
FILE: examples/embed.py
================================================
from ollama import embed
response = embed(model='llama3.2', input='Hello, world!')
print(response['embeddings'])
================================================
FILE: examples/fill-in-middle.py
================================================
from ollama import generate
prompt = '''def remove_non_ascii(s: str) -> str:
    """ '''
suffix = """
    return result
"""
response = generate(
model='codellama:7b-code',
prompt=prompt,
suffix=suffix,
options={
'num_predict': 128,
'temperature': 0,
'top_p': 0.9,
'stop': ['<EOT>'],
},
)
print(response['response'])
================================================
FILE: examples/generate-image.py
================================================
# Image generation is experimental and currently only available on macOS
import base64
from ollama import generate
prompt = 'a sunset over mountains'
print(f'Prompt: {prompt}')
for response in generate(model='x/z-image-turbo', prompt=prompt, stream=True):
  if response.image:
    # Final response contains the image
    with open('output.png', 'wb') as f:
      f.write(base64.b64decode(response.image))
    print('\nImage saved to output.png')
  elif response.total:
    # Progress update
    print(f'Progress: {response.completed or 0}/{response.total}', end='\r')
================================================
FILE: examples/generate-logprobs.py
================================================
from typing import Iterable
import ollama
def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
  print(f'\n{label}:')
  for entry in logprobs:
    token = entry.get('token', '')
    logprob = entry.get('logprob')
    print(f' token={token!r:<12} logprob={logprob:.3f}')
    for alt in entry.get('top_logprobs', []):
      if alt['token'] != token:
        print(f' alt -> {alt["token"]!r:<12} ({alt["logprob"]:.3f})')
response = ollama.generate(
model='gemma3',
prompt='hi! be concise.',
logprobs=True,
top_logprobs=3,
)
print('Generate response:', response['response'])
print_logprobs(response.get('logprobs', []), 'generate logprobs')
================================================
FILE: examples/generate-stream.py
================================================
from ollama import generate
for part in generate('gemma3', 'Why is the sky blue?', stream=True):
  print(part['response'], end='', flush=True)
================================================
FILE: examples/generate.py
================================================
from ollama import generate
response = generate('gemma3', 'Why is the sky blue?')
print(response['response'])
================================================
FILE: examples/gpt-oss-tools-stream.py
================================================
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "gpt-oss",
# "ollama",
# "rich",
# ]
# ///
import random
from typing import Iterator
from rich import print
from ollama import Client
from ollama._types import ChatResponse
def get_weather(city: str) -> str:
  """
  Get the current temperature for a city
  Args:
    city (str): The name of the city
  Returns:
    str: The current temperature
  """
  temperatures = list(range(-10, 35))
  temp = random.choice(temperatures)
  return f'The temperature in {city} is {temp}°C'
def get_weather_conditions(city: str) -> str:
  """
  Get the weather conditions for a city
  Args:
    city (str): The name of the city
  Returns:
    str: The current weather conditions
  """
  conditions = ['sunny', 'cloudy', 'rainy', 'snowy', 'foggy']
  return random.choice(conditions)
available_tools = {'get_weather': get_weather, 'get_weather_conditions': get_weather_conditions}
messages = [{'role': 'user', 'content': 'What is the weather like in London? What are the conditions in Toronto?'}]
client = Client(
# Ollama Turbo
# host="https://ollama.com", headers={'Authorization': (os.getenv('OLLAMA_API_KEY'))}
)
model = 'gpt-oss:20b'
# gpt-oss can call tools while "thinking"
# a loop is needed to call the tools and get the results
final = True
while True:
  response_stream: Iterator[ChatResponse] = client.chat(model=model, messages=messages, tools=[get_weather, get_weather_conditions], stream=True)
  tool_calls = []
  thinking = ''
  content = ''
  for chunk in response_stream:
    if chunk.message.tool_calls:
      tool_calls.extend(chunk.message.tool_calls)
    if chunk.message.content:
      if not (chunk.message.thinking or chunk.message.thinking == '') and final:
        print('\n\n' + '=' * 10)
        print('Final result: ')
        final = False
      # accumulate content so the assistant message added to the history is complete
      content += chunk.message.content
      print(chunk.message.content, end='', flush=True)
    if chunk.message.thinking:
      # accumulate thinking
      thinking += chunk.message.thinking
      print(chunk.message.thinking, end='', flush=True)
  if thinking != '' or content != '' or len(tool_calls) > 0:
    messages.append({'role': 'assistant', 'thinking': thinking, 'content': content, 'tool_calls': tool_calls})
  print()
  if tool_calls:
    for tool_call in tool_calls:
      function_to_call = available_tools.get(tool_call.function.name)
      if function_to_call:
        print('\nCalling tool:', tool_call.function.name, 'with arguments: ', tool_call.function.arguments)
        result = function_to_call(**tool_call.function.arguments)
        print('Tool result: ', result + '\n')
        result_message = {'role': 'tool', 'content': result, 'tool_name': tool_call.function.name}
        messages.append(result_message)
      else:
        print(f'Tool {tool_call.function.name} not found')
        messages.append({'role': 'tool', 'content': f'Tool {tool_call.function.name} not found', 'tool_name': tool_call.function.name})
  else:
    # no more tool calls, we can stop the loop
    break
================================================
FILE: examples/gpt-oss-tools.py
================================================
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "gpt-oss",
# "ollama",
# "rich",
# ]
# ///
import random
from rich import print
from ollama import Client
from ollama._types import ChatResponse
def get_weather(city: str) -> str:
  """
  Get the current temperature for a city
  Args:
    city (str): The name of the city
  Returns:
    str: The current temperature
  """
  temperatures = list(range(-10, 35))
  temp = random.choice(temperatures)
  return f'The temperature in {city} is {temp}°C'
def get_weather_conditions(city: str) -> str:
  """
  Get the weather conditions for a city
  Args:
    city (str): The name of the city
  Returns:
    str: The current weather conditions
  """
  conditions = ['sunny', 'cloudy', 'rainy', 'snowy', 'foggy']
  return random.choice(conditions)
available_tools = {'get_weather': get_weather, 'get_weather_conditions': get_weather_conditions}
messages = [{'role': 'user', 'content': 'What is the weather like in London? What are the conditions in Toronto?'}]
client = Client(
# Ollama Turbo
# host="https://ollama.com", headers={'Authorization': (os.getenv('OLLAMA_API_KEY'))}
)
model = 'gpt-oss:20b'
# gpt-oss can call tools while "thinking"
# a loop is needed to call the tools and get the results
while True:
  response: ChatResponse = client.chat(model=model, messages=messages, tools=[get_weather, get_weather_conditions])
  if response.message.content:
    print('Content: ')
    print(response.message.content + '\n')
  if response.message.thinking:
    print('Thinking: ')
    print(response.message.thinking + '\n')
  messages.append(response.message)
  if response.message.tool_calls:
    for tool_call in response.message.tool_calls:
      function_to_call = available_tools.get(tool_call.function.name)
      if function_to_call:
        result = function_to_call(**tool_call.function.arguments)
        print('Result from tool call name: ', tool_call.function.name, 'with arguments: ', tool_call.function.arguments, 'result: ', result + '\n')
        messages.append({'role': 'tool', 'content': result, 'tool_name': tool_call.function.name})
      else:
        print(f'Tool {tool_call.function.name} not found')
        messages.append({'role': 'tool', 'content': f'Tool {tool_call.function.name} not found', 'tool_name': tool_call.function.name})
  else:
    # no more tool calls, we can stop the loop
    break
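# The dispatch step of the loop above can be tried without a running model. This is a minimal
# sketch, not part of the example: `dispatch_tool_calls`, `FakeCall` and `FakeFunction` are
# hypothetical names that only mimic the `tool_call.function.name` / `.arguments` shape used above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FakeFunction:
  # Hypothetical stand-in for the function part of a tool call
  name: str
  arguments: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FakeCall:
  # Hypothetical stand-in for one entry of response.message.tool_calls
  function: FakeFunction


def dispatch_tool_calls(tool_calls: List[FakeCall], registry: Dict[str, Callable[..., Any]]) -> List[Dict[str, str]]:
  """Run each requested tool and build the 'tool' messages to send back."""
  messages = []
  for call in tool_calls:
    fn = registry.get(call.function.name)
    if fn is None:
      # Report unknown tools to the model instead of raising
      content = f'Tool {call.function.name} not found'
    else:
      content = str(fn(**call.function.arguments))
    messages.append({'role': 'tool', 'content': content, 'tool_name': call.function.name})
  return messages
```

# Every tool call yields exactly one 'tool' message, including calls to unknown tools.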
================================================
FILE: examples/list.py
================================================
from ollama import ListResponse, list
response: ListResponse = list()
for model in response.models:
  print('Name:', model.model)
  print(' Size (MB):', f'{(model.size.real / 1024 / 1024):.2f}')
  if model.details:
    print(' Format:', model.details.format)
    print(' Family:', model.details.family)
    print(' Parameter Size:', model.details.parameter_size)
    print(' Quantization Level:', model.details.quantization_level)
  print('\n')
================================================
FILE: examples/multi-tool.py
================================================
import random
from typing import Iterator
from ollama import ChatResponse, Client
def get_temperature(city: str) -> str:
  """
  Get the temperature for a city in Celsius
  Args:
    city (str): The name of the city
  Returns:
    str: The current temperature in Celsius, or 'Unknown city'
  """
  # This is a mock implementation - would need to use a real weather API
  if city not in ['London', 'Paris', 'New York', 'Tokyo', 'Sydney']:
    return 'Unknown city'
  return str(random.randint(0, 35)) + ' degrees Celsius'
def get_conditions(city: str) -> str:
  """
  Get the weather conditions for a city
  """
  if city not in ['London', 'Paris', 'New York', 'Tokyo', 'Sydney']:
    return 'Unknown city'
  # This is a mock implementation - would need to use a real weather API
  conditions = ['sunny', 'cloudy', 'rainy', 'snowy']
  return random.choice(conditions)
available_functions = {
  'get_temperature': get_temperature,
  'get_conditions': get_conditions,
}
cities = ['London', 'Paris', 'New York', 'Tokyo', 'Sydney']
city = random.choice(cities)
city2 = random.choice(cities)
messages = [{'role': 'user', 'content': f'What is the temperature in {city}? and what are the weather conditions in {city2}?'}]
print('----- Prompt:', messages[0]['content'], '\n')
model = 'qwen3'
client = Client()
response: Iterator[ChatResponse] = client.chat(model, stream=True, messages=messages, tools=[get_temperature, get_conditions], think=True)
for chunk in response:
  if chunk.message.thinking:
    print(chunk.message.thinking, end='', flush=True)
  if chunk.message.content:
    print(chunk.message.content, end='', flush=True)
  if chunk.message.tool_calls:
    for tool in chunk.message.tool_calls:
      if function_to_call := available_functions.get(tool.function.name):
        print('\nCalling function:', tool.function.name, 'with arguments:', tool.function.arguments)
        output = function_to_call(**tool.function.arguments)
        print('> Function output:', output, '\n')
        # Add the assistant message and tool call result to the messages
        messages.append(chunk.message)
        messages.append({'role': 'tool', 'content': str(output), 'tool_name': tool.function.name})
      else:
        print('Function', tool.function.name, 'not found')
print('----- Sending result back to model \n')
if any(msg.get('role') == 'tool' for msg in messages):
  res = client.chat(model, stream=True, tools=[get_temperature, get_conditions], messages=messages, think=True)
  done_thinking = False
  for chunk in res:
    if chunk.message.thinking:
      print(chunk.message.thinking, end='', flush=True)
    if chunk.message.content:
      if not done_thinking:
        print('\n----- Final result:')
        done_thinking = True
      print(chunk.message.content, end='', flush=True)
    if chunk.message.tool_calls:
      # Model should be explaining the tool calls and the results in this output
      print('Model returned tool calls:')
      print(chunk.message.tool_calls)
else:
  print('No tool calls returned')
================================================
FILE: examples/multimodal-chat.py
================================================
from ollama import chat
# from pathlib import Path
# Pass in the path to the image
path = input('Please enter the path to the image: ')
# You can also pass in base64 encoded image data
# img = base64.b64encode(Path(path).read_bytes()).decode()
# or the raw bytes
# img = Path(path).read_bytes()
response = chat(
  model='gemma3',
  messages=[
    {
      'role': 'user',
      'content': 'What is in this image? Be concise.',
      'images': [path],
    }
  ],
)
print(response.message.content)
================================================
FILE: examples/multimodal-generate.py
================================================
import random
import sys
import httpx
from ollama import generate
latest = httpx.get('https://xkcd.com/info.0.json')
latest.raise_for_status()
num = int(sys.argv[1]) if len(sys.argv) > 1 else random.randint(1, latest.json().get('num'))
comic = httpx.get(f'https://xkcd.com/{num}/info.0.json')
comic.raise_for_status()
print(f'xkcd #{comic.json().get("num")}: {comic.json().get("alt")}')
print(f'link: https://xkcd.com/{num}')
print('---')
raw = httpx.get(comic.json().get('img'))
raw.raise_for_status()
for response in generate('llava', 'explain this comic:', images=[raw.content], stream=True):
  print(response['response'], end='', flush=True)
print()
================================================
FILE: examples/ps.py
================================================
from ollama import ProcessResponse, chat, ps, pull
# Ensure at least one model is loaded
response = pull('gemma3', stream=True)
progress_states = set()
for progress in response:
  if progress.get('status') in progress_states:
    continue
  progress_states.add(progress.get('status'))
  print(progress.get('status'))
print('\n')
print('Waiting for model to load... \n')
chat(model='gemma3', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
response: ProcessResponse = ps()
for model in response.models:
  print('Model: ', model.model)
  print(' Digest: ', model.digest)
  print(' Expires at: ', model.expires_at)
  print(' Size: ', model.size)
  print(' Size vram: ', model.size_vram)
  print(' Details: ', model.details)
  print(' Context length: ', model.context_length)
  print('\n')
================================================
FILE: examples/pull.py
================================================
from tqdm import tqdm
from ollama import pull
current_digest, bars = '', {}
for progress in pull('gemma3', stream=True):
  digest = progress.get('digest', '')
  if digest != current_digest and current_digest in bars:
    bars[current_digest].close()
  if not digest:
    print(progress.get('status'))
    continue
  if digest not in bars and (total := progress.get('total')):
    bars[digest] = tqdm(total=total, desc=f'pulling {digest[7:19]}', unit='B', unit_scale=True)
  if completed := progress.get('completed'):
    bars[digest].update(completed - bars[digest].n)
  current_digest = digest
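# A minimal sketch, not part of the example above: the same progress events can be reduced to
# per-digest percentages without tqdm. `summarize_progress` is a hypothetical helper, and the
# events are assumed to be dict-like with the 'digest', 'total' and 'completed' keys read above.

```python
from typing import Any, Dict, Iterable, Mapping


def summarize_progress(events: Iterable[Mapping[str, Any]]) -> Dict[str, float]:
  """Return the latest completion percentage seen for each layer digest."""
  latest: Dict[str, float] = {}
  for event in events:
    digest = event.get('digest')
    total = event.get('total')
    if not digest or not total:
      # Status-only events (no digest or no total) carry no byte counts
      continue
    completed = event.get('completed') or 0
    latest[digest] = min(100.0, 100.0 * completed / total)
  return latest
```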
================================================
FILE: examples/show.py
================================================
from ollama import ShowResponse, show
response: ShowResponse = show('gemma3')
print('Model Information:')
print(f'Modified at: {response.modified_at}')
print(f'Template: {response.template}')
print(f'Modelfile: {response.modelfile}')
print(f'License: {response.license}')
print(f'Details: {response.details}')
print(f'Model Info: {response.modelinfo}')
print(f'Parameters: {response.parameters}')
print(f'Capabilities: {response.capabilities}')
================================================
FILE: examples/structured-outputs-image.py
================================================
from pathlib import Path
from typing import Literal
from pydantic import BaseModel
from ollama import chat
# Define the schema for image objects
class Object(BaseModel):
  name: str
  confidence: float
  attributes: str
class ImageDescription(BaseModel):
  summary: str
  objects: list[Object]
  scene: str
  colors: list[str]
  time_of_day: Literal['Morning', 'Afternoon', 'Evening', 'Night']
  setting: Literal['Indoor', 'Outdoor', 'Unknown']
  text_content: str | None = None
# Get path from user input
path = input('Enter the path to your image: ')
path = Path(path)
# Verify the file exists
if not path.exists():
  raise FileNotFoundError(f'Image not found at: {path}')
# Set up chat as usual
response = chat(
model='gemma3',
format=ImageDescription.model_json_schema(), # Pass in the schema for the response
messages=[
{
'role': 'user',
'content': 'Analyze this image and return a detailed JSON description including objects, scene, colors and any text detected. If you cannot determine certain details, leave those fields empty.',
'images': [path],
},
],
options={'temperature': 0}, # Set temperature to 0 for more deterministic output
)
# Convert received content to the schema
image_analysis = ImageDescription.model_validate_json(response.message.content)
print(image_analysis)
================================================
FILE: examples/structured-outputs.py
================================================
from pydantic import BaseModel
from ollama import chat
# Define the schema for the response
class FriendInfo(BaseModel):
  name: str
  age: int
  is_available: bool
class FriendList(BaseModel):
  friends: list[FriendInfo]
# schema = {'type': 'object', 'properties': {'friends': {'type': 'array', 'items': {'type': 'object', 'properties': {'name': {'type': 'string'}, 'age': {'type': 'integer'}, 'is_available': {'type': 'boolean'}}, 'required': ['name', 'age', 'is_available']}}}, 'required': ['friends']}
response = chat(
model='llama3.1:8b',
messages=[{'role': 'user', 'content': 'I have two friends. The first is Ollama 22 years old busy saving the world, and the second is Alonso 23 years old and wants to hang out. Return a list of friends in JSON format'}],
format=FriendList.model_json_schema(), # Use Pydantic to generate the schema or format=schema
options={'temperature': 0}, # Make responses more deterministic
)
# Use Pydantic to validate the response
friends_response = FriendList.model_validate_json(response.message.content)
print(friends_response)
================================================
FILE: examples/thinking-generate.py
================================================
from ollama import generate
response = generate('deepseek-r1', 'why is the sky blue', think=True)
print('Thinking:\n========\n\n' + response.thinking)
print('\nResponse:\n========\n\n' + response.response)
================================================
FILE: examples/thinking-levels.py
================================================
from ollama import chat
def heading(text):
  print(text)
  print('=' * len(text))
messages = [
{'role': 'user', 'content': 'What is 10 + 23?'},
]
# gpt-oss supports 'low', 'medium', 'high'
levels = ['low', 'medium', 'high']
for i, level in enumerate(levels):
  response = chat('gpt-oss:20b', messages=messages, think=level)
  heading(f'Thinking ({level})')
  print(response.message.thinking)
  print('\n')
  heading('Response')
  print(response.message.content)
  print('\n')
  if i < len(levels) - 1:
    print('-' * 20)
    print('\n')
================================================
FILE: examples/thinking.py
================================================
from ollama import chat
messages = [
{
'role': 'user',
'content': 'What is 10 + 23?',
},
]
response = chat('deepseek-r1', messages=messages, think=True)
print('Thinking:\n========\n\n' + response.message.thinking)
print('\nResponse:\n========\n\n' + response.message.content)
================================================
FILE: examples/tools.py
================================================
from ollama import ChatResponse, chat
def add_two_numbers(a: int, b: int) -> int:
  """
  Add two numbers
  Args:
    a (int): The first number
    b (int): The second number
  Returns:
    int: The sum of the two numbers
  """
  # The cast is necessary as returned tool call arguments don't always conform exactly to schema
  # E.g. this prevents "what is 30 + 12" from producing '3012' instead of 42
  return int(a) + int(b)
def subtract_two_numbers(a: int, b: int) -> int:
  """
  Subtract two numbers
  """
  # The cast is necessary as returned tool call arguments don't always conform exactly to schema
  return int(a) - int(b)
# Tools can still be manually defined and passed into chat
subtract_two_numbers_tool = {
'type': 'function',
'function': {
'name': 'subtract_two_numbers',
'description': 'Subtract two numbers',
'parameters': {
'type': 'object',
'required': ['a', 'b'],
'properties': {
'a': {'type': 'integer', 'description': 'The first number'},
'b': {'type': 'integer', 'description': 'The second number'},
},
},
},
}
messages = [{'role': 'user', 'content': 'What is three plus one?'}]
print('Prompt:', messages[0]['content'])
available_functions = {
'add_two_numbers': add_two_numbers,
'subtract_two_numbers': subtract_two_numbers,
}
response: ChatResponse = chat(
'llama3.1',
messages=messages,
tools=[add_two_numbers, subtract_two_numbers_tool],
)
if response.message.tool_calls:
  # Add the assistant message that requested the tool calls
  messages.append(response.message)
  # There may be multiple tool calls in the response
  for tool in response.message.tool_calls:
    # Ensure the function is available, and then call it
    if function_to_call := available_functions.get(tool.function.name):
      print('Calling function:', tool.function.name)
      print('Arguments:', tool.function.arguments)
      output = function_to_call(**tool.function.arguments)
      print('Function output:', output)
    else:
      print('Function', tool.function.name, 'not found')
      output = f'Function {tool.function.name} not found'
    # Add each function response to messages for the model to use
    messages.append({'role': 'tool', 'content': str(output), 'tool_name': tool.function.name})
  # Get final response from model with function outputs
  final_response = chat('llama3.1', messages=messages)
  print('Final response:', final_response.message.content)
else:
  print('No tool calls returned from model')
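# The int() casts in the tool functions above guard against arguments that arrive as strings.
# A minimal sketch of doing that cast once, before the call, follows. `coerce_arguments` is a
# hypothetical helper (not part of the ollama package) and only handles plain int/float/str
# annotations; anything else is passed through unchanged.

```python
import inspect
from typing import Any, Callable, Dict


def coerce_arguments(func: Callable[..., Any], arguments: Dict[str, Any]) -> Dict[str, Any]:
  """Cast each argument to the simple type named in func's annotations."""
  params = inspect.signature(func).parameters
  coerced: Dict[str, Any] = {}
  for name, value in arguments.items():
    param = params.get(name)
    annotation = param.annotation if param is not None else inspect.Parameter.empty
    if annotation in (int, float, str):
      coerced[name] = annotation(value)
    else:
      # Unknown names and complex annotations are left as-is
      coerced[name] = value
  return coerced


def add(a: int, b: int) -> int:
  # No cast needed here once the arguments have been coerced
  return a + b
```

# Usage: add(**coerce_arguments(add, tool.function.arguments)). A value that cannot be cast,
# such as 'abc' for an int parameter, raises ValueError.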
================================================
FILE: examples/web-search-gpt-oss.py
================================================
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "ollama",
# ]
# ///
from typing import Any, Dict, List
from web_search_gpt_oss_helper import Browser
from ollama import Client
def main() -> None:
  client = Client()
  browser = Browser(initial_state=None, client=client)
  def browser_search(query: str, topn: int = 10) -> str:
    return browser.search(query=query, topn=topn)['pageText']
  def browser_open(id: int | str | None = None, cursor: int = -1, loc: int = -1, num_lines: int = -1) -> str:
    return browser.open(id=id, cursor=cursor, loc=loc, num_lines=num_lines)['pageText']
  def browser_find(pattern: str, cursor: int = -1, **_: Any) -> str:
    return browser.find(pattern=pattern, cursor=cursor)['pageText']
  browser_search_schema = {
    'type': 'function',
    'function': {
      'name': 'browser.search',
    },
  }
  browser_open_schema = {
    'type': 'function',
    'function': {
      'name': 'browser.open',
    },
  }
  browser_find_schema = {
    'type': 'function',
    'function': {
      'name': 'browser.find',
    },
  }
  available_tools = {
    'browser.search': browser_search,
    'browser.open': browser_open,
    'browser.find': browser_find,
  }
  query = "what is ollama's new engine"
  print('Prompt:', query, '\n')
  messages: List[Dict[str, Any]] = [{'role': 'user', 'content': query}]
  while True:
    resp = client.chat(
      model='gpt-oss:120b-cloud',
      messages=messages,
      tools=[browser_search_schema, browser_open_schema, browser_find_schema],
      think=True,
    )
    if resp.message.thinking:
      print('Thinking:\n========\n')
      print(resp.message.thinking + '\n')
    if resp.message.content:
      print('Response:\n========\n')
      print(resp.message.content + '\n')
    messages.append(resp.message)
    if not resp.message.tool_calls:
      break
    for tc in resp.message.tool_calls:
      tool_name = tc.function.name
      args = tc.function.arguments or {}
      print(f'Tool name: {tool_name}, args: {args}')
      fn = available_tools.get(tool_name)
      if not fn:
        messages.append({'role': 'tool', 'content': f'Tool {tool_name} not found', 'tool_name': tool_name})
        continue
      try:
        result_text = fn(**args)
        print('Result: ', result_text[:200] + '...')
      except Exception as e:
        result_text = f'Error from {tool_name}: {e}'
      messages.append({'role': 'tool', 'content': result_text, 'tool_name': tool_name})
if __name__ == '__main__':
  main()
================================================
FILE: examples/web-search-mcp.py
================================================
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "mcp",
# "rich",
# "ollama",
# ]
# ///
"""
MCP stdio server exposing Ollama web_search and web_fetch as tools.
Environment:
- OLLAMA_API_KEY (required): API key sent in the Authorization header.
"""
from __future__ import annotations
import asyncio
from typing import Any, Dict
from ollama import Client
try:
  # Preferred high-level API (if available)
  from mcp.server.fastmcp import FastMCP  # type: ignore
  _FASTMCP_AVAILABLE = True
except Exception:
  _FASTMCP_AVAILABLE = False
if not _FASTMCP_AVAILABLE:
  # Fallback to the low-level stdio server API
  from mcp.server import Server  # type: ignore
  from mcp.server.stdio import stdio_server  # type: ignore
client = Client()
def _web_search_impl(query: str, max_results: int = 3) -> Dict[str, Any]:
  res = client.web_search(query=query, max_results=max_results)
  return res.model_dump()
def _web_fetch_impl(url: str) -> Dict[str, Any]:
  res = client.web_fetch(url=url)
  return res.model_dump()
if _FASTMCP_AVAILABLE:
  app = FastMCP('ollama-search-fetch')
  @app.tool()
  def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
    """
    Perform a web search using Ollama's hosted search API.
    Args:
      query: The search query to run.
      max_results: Maximum results to return (default: 3).
    Returns:
      JSON-serializable dict matching ollama.WebSearchResponse.model_dump()
    """
    return _web_search_impl(query=query, max_results=max_results)
  @app.tool()
  def web_fetch(url: str) -> Dict[str, Any]:
    """
    Fetch the content of a web page for the provided URL.
    Args:
      url: The absolute URL to fetch.
    Returns:
      JSON-serializable dict matching ollama.WebFetchResponse.model_dump()
    """
    return _web_fetch_impl(url=url)
  if __name__ == '__main__':
    app.run()
else:
  server = Server('ollama-search-fetch')  # type: ignore[name-defined]
  @server.tool()  # type: ignore[attr-defined]
  async def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
    """
    Perform a web search using Ollama's hosted search API.
    Args:
      query: The search query to run.
      max_results: Maximum results to return (default: 3).
    """
    return await asyncio.to_thread(_web_search_impl, query, max_results)
  @server.tool()  # type: ignore[attr-defined]
  async def web_fetch(url: str) -> Dict[str, Any]:
    """
    Fetch the content of a web page for the provided URL.
    Args:
      url: The absolute URL to fetch.
    """
    return await asyncio.to_thread(_web_fetch_impl, url)
  async def _main() -> None:
    async with stdio_server() as (read, write):  # type: ignore[name-defined]
      await server.run(read, write)  # type: ignore[attr-defined]
  if __name__ == '__main__':
    asyncio.run(_main())
================================================
FILE: examples/web-search.py
================================================
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "rich",
# "ollama",
# ]
# ///
from typing import Union
from rich import print
from ollama import WebFetchResponse, WebSearchResponse, chat, web_fetch, web_search
def format_tool_results(
  results: Union[WebSearchResponse, WebFetchResponse],
  user_search: str,
):
  output = []
  if isinstance(results, WebSearchResponse):
    output.append(f'Search results for "{user_search}":')
    for result in results.results:
      output.append(f'{result.title}' if result.title else f'{result.content}')
      output.append(f' URL: {result.url}')
      output.append(f' Content: {result.content}')
      output.append('')
    return '\n'.join(output).rstrip()
  elif isinstance(results, WebFetchResponse):
    output.append(f'Fetch results for "{user_search}":')
    output.extend(
      [
        f'Title: {results.title}',
        f'URL: {user_search}' if user_search else '',
        f'Content: {results.content}',
      ]
    )
    if results.links:
      output.append(f'Links: {", ".join(results.links)}')
    output.append('')
    return '\n'.join(output).rstrip()
# client = Client(headers={'Authorization': f"Bearer {os.getenv('OLLAMA_API_KEY')}"} if api_key else None)
available_tools = {'web_search': web_search, 'web_fetch': web_fetch}
query = "what is ollama's new engine"
print('Query: ', query)
messages = [{'role': 'user', 'content': query}]
while True:
  response = chat(model='qwen3', messages=messages, tools=[web_search, web_fetch], think=True)
  if response.message.thinking:
    print('Thinking: ')
    print(response.message.thinking + '\n\n')
  if response.message.content:
    print('Content: ')
    print(response.message.content + '\n')
  messages.append(response.message)
  if response.message.tool_calls:
    for tool_call in response.message.tool_calls:
      function_to_call = available_tools.get(tool_call.function.name)
      if function_to_call:
        args = tool_call.function.arguments
        result: Union[WebSearchResponse, WebFetchResponse] = function_to_call(**args)
        print('Result from tool call name:', tool_call.function.name, 'with arguments:')
        print(args)
        print()
        user_search = args.get('query', '') or args.get('url', '')
        formatted_tool_results = format_tool_results(result, user_search=user_search)
        print(formatted_tool_results[:300])
        print()
        # caps the result at ~2000 tokens
        messages.append({'role': 'tool', 'content': formatted_tool_results[: 2000 * 4], 'tool_name': tool_call.function.name})
      else:
        print(f'Tool {tool_call.function.name} not found')
        messages.append({'role': 'tool', 'content': f'Tool {tool_call.function.name} not found', 'tool_name': tool_call.function.name})
  else:
    # no more tool calls, we can stop the loop
    break
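# The `[: 2000 * 4]` slice above caps tool output using a rough rule of four characters per
# token. A minimal sketch of that cap as a named helper follows; `cap_to_token_budget` is a
# hypothetical name, and the ratio is an approximation rather than a real tokenizer count.

```python
def cap_to_token_budget(text: str, max_tokens: int = 2000, chars_per_token: int = 4) -> str:
  """Truncate text to roughly max_tokens, assuming chars_per_token characters per token."""
  limit = max(0, max_tokens * chars_per_token)
  return text[:limit]
```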
================================================
FILE: examples/web_search_gpt_oss_helper.py
================================================
from __future__ import annotations
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional, Protocol, Tuple
from urllib.parse import urlparse
from ollama import Client
@dataclass
class Page:
  url: str
  title: str
  text: str
  lines: List[str]
  links: Dict[int, str]
  fetched_at: datetime
@dataclass
class BrowserStateData:
  page_stack: List[str] = field(default_factory=list)
  view_tokens: int = 1024
  url_to_page: Dict[str, Page] = field(default_factory=dict)
@dataclass
class WebSearchResult:
  title: str
  url: str
  content: Dict[str, str]
class SearchClient(Protocol):
  def search(self, queries: List[str], max_results: Optional[int] = None): ...
class CrawlClient(Protocol):
  def crawl(self, urls: List[str]): ...
# ---- Constants ---------------------------------------------------------------
DEFAULT_VIEW_TOKENS = 1024
CAPPED_TOOL_CONTENT_LEN = 8000
# ---- Helpers ----------------------------------------------------------------
def cap_tool_content(text: str) -> str:
  if not text:
    return text
  if len(text) <= CAPPED_TOOL_CONTENT_LEN:
    return text
  if CAPPED_TOOL_CONTENT_LEN <= 1:
    return text[:CAPPED_TOOL_CONTENT_LEN]
  return text[: CAPPED_TOOL_CONTENT_LEN - 1] + '…'
def _safe_domain(u: str) -> str:
  try:
    parsed = urlparse(u)
    host = parsed.netloc or u
    return host.replace('www.', '') if host else u
  except Exception:
    return u
# ---- BrowserState ------------------------------------------------------------
class BrowserState:
  def __init__(self, initial_state: Optional[BrowserStateData] = None):
    self._data = initial_state or BrowserStateData(view_tokens=DEFAULT_VIEW_TOKENS)
  def get_data(self) -> BrowserStateData:
    return self._data
  def set_data(self, data: BrowserStateData) -> None:
    self._data = data
# ---- Browser ----------------------------------------------------------------
class Browser:
  def __init__(
    self,
    initial_state: Optional[BrowserStateData] = None,
    client: Optional[Client] = None,
  ):
    self.state = BrowserState(initial_state)
    self._client: Optional[Client] = client
  def set_client(self, client: Client) -> None:
    self._client = client
  def get_state(self) -> BrowserStateData:
    return self.state.get_data()
# ---- internal utils ----
  def _save_page(self, page: Page) -> None:
    data = self.state.get_data()
    data.url_to_page[page.url] = page
    data.page_stack.append(page.url)
    self.state.set_data(data)
  def _page_from_stack(self, url: str) -> Page:
    data = self.state.get_data()
    page = data.url_to_page.get(url)
    if not page:
      raise ValueError(f'Page not found for url {url}')
    return page
  def _join_lines_with_numbers(self, lines: List[str]) -> str:
    result = []
    for i, line in enumerate(lines):
      result.append(f'L{i}: {line}')
    return '\n'.join(result)
  def _wrap_lines(self, text: str, width: int = 80) -> List[str]:
    if width <= 0:
      width = 80
    src_lines = text.split('\n')
    wrapped: List[str] = []
    for line in src_lines:
      if line == '':
        wrapped.append('')
      elif len(line) <= width:
        wrapped.append(line)
      else:
        words = re.split(r'\s+', line)
        if not words:
          wrapped.append(line)
          continue
        curr = ''
        for w in words:
          test = (curr + ' ' + w) if curr else w
          if len(test) > width and curr:
            wrapped.append(curr)
            curr = w
          else:
            curr = test
        if curr:
          wrapped.append(curr)
    return wrapped
  def _process_markdown_links(self, text: str) -> Tuple[str, Dict[int, str]]:
    links: Dict[int, str] = {}
    link_id = 0
    multiline_pattern = re.compile(r'\[([^\]]+)\]\s*\n\s*\(([^)]+)\)')
    text = multiline_pattern.sub(lambda m: f'[{m.group(1)}]({m.group(2)})', text)
    text = re.sub(r'\s+', ' ', text)
    link_pattern = re.compile(r'\[([^\]]+)\]\(([^)]+)\)')
    def _repl(m: re.Match) -> str:
      nonlocal link_id
      link_text = m.group(1).strip()
      link_url = m.group(2).strip()
      domain = _safe_domain(link_url)
      formatted = f'【{link_id}†{link_text}†{domain}】'
      links[link_id] = link_url
      link_id += 1
      return formatted
    processed = link_pattern.sub(_repl, text)
    return processed, links
  def _get_end_loc(self, loc: int, num_lines: int, total_lines: int, lines: List[str]) -> int:
    if num_lines <= 0:
      txt = self._join_lines_with_numbers(lines[loc:])
      data = self.state.get_data()
      chars_per_token = 4
      max_chars = min(data.view_tokens * chars_per_token, len(txt))
      num_lines = txt[:max_chars].count('\n') + 1
    return min(loc + num_lines, total_lines)
  def _display_page(self, page: Page, cursor: int, loc: int, num_lines: int) -> str:
    total_lines = len(page.lines) or 0
    if total_lines == 0:
      page.lines = ['']
      total_lines = 1
    if loc != loc or loc < 0:
      loc = 0
    elif loc >= total_lines:
      loc = max(0, total_lines - 1)
    end_loc = self._get_end_loc(loc, num_lines, total_lines, page.lines)
    header = f'[{cursor}] {page.title}'
    header += f'({page.url})\n' if page.url else '\n'
    header += f'**viewing lines [{loc} - {end_loc - 1}] of {total_lines - 1}**\n\n'
    body_lines = []
    for i in range(loc, end_loc):
      body_lines.append(f'L{i}: {page.lines[i]}')
    return header + '\n'.join(body_lines)
# ---- page builders ----
  def _build_search_results_page_collection(self, query: str, results: Dict[str, Any]) -> Page:
    page = Page(
      url=f'search_results_{query}',
      title=query,
      text='',
      lines=[],
      links={},
      fetched_at=datetime.utcnow(),
    )
    tb = []
    tb.append('')
    tb.append('# Search Results')
    tb.append('')
    link_idx = 0
    for query_results in results.get('results', {}).values():
      for result in query_results:
        domain = _safe_domain(result.get('url', ''))
        link_fmt = f'* 【{link_idx}†{result.get("title", "")}†{domain}】'
        tb.append(link_fmt)
        raw_snip = result.get('content') or ''
        capped = (raw_snip[:400] + '…') if len(raw_snip) > 400 else raw_snip
        cleaned = re.sub(r'\d{40,}', lambda m: m.group(0)[:40] + '…', capped)
        cleaned = re.sub(r'\s{3,}', ' ', cleaned)
        tb.append(cleaned)
        page.links[link_idx] = result.get('url', '')
        link_idx += 1
    page.text = '\n'.join(tb)
    page.lines = self._wrap_lines(page.text, 80)
    return page
  def _build_search_result_page(self, result: WebSearchResult, link_idx: int) -> Page:
    page = Page(
      url=result.url,
      title=result.title,
      text='',
      lines=[],
      links={},
      fetched_at=datetime.utcnow(),
    )
    link_fmt = f'【{link_idx}†{result.title}】\n'
    preview = link_fmt + f'URL: {result.url}\n'
    full_text = result.content.get('fullText', '') if result.content else ''
    preview += full_text[:300] + '\n\n'
    if not full_text:
      page.links[link_idx] = result.url
    if full_text:
      raw = f'URL: {result.url}\n{full_text}'
      processed, links = self._process_markdown_links(raw)
      page.text = processed
      page.links = links
    else:
      page.text = preview
    page.lines = self._wrap_lines(page.text, 80)
    return page
  def _build_page_from_fetch(self, requested_url: str, fetch_response: Dict[str, Any]) -> Page:
    page = Page(
      url=requested_url,
      title=requested_url,
      text='',
      lines=[],
      links={},
      fetched_at=datetime.utcnow(),
    )
    for url, url_results in fetch_response.get('results', {}).items():
      if url_results:
        r0 = url_results[0]
        if r0.get('content'):
          page.text = r0['content']
        if r0.get('title'):
          page.title = r0['title']
        page.url = url
        break
    if not page.text:
      page.text = 'No content could be extracted from this page.'
    else:
      page.text = f'URL: {page.url}\n{page.text}'
    processed, links = self._process_markdown_links(page.text)
    page.text = processed
    page.links = links
    page.lines = self._wrap_lines(page.text, 80)
    return page
  def _build_find_results_page(self, pattern: str, page: Page) -> Page:
    find_page = Page(
      url=f'find_results_{pattern}',
      title=f'Find results for text: `{pattern}` in `{page.title}`',
      text='',
      lines=[],
      links={},
      fetched_at=datetime.utcnow(),
    )
    max_results = 50
    num_show_lines = 4
    pattern_lower = pattern.lower()
    result_chunks: List[str] = []
    line_idx = 0
    while line_idx < len(page.lines):
      line = page.lines[line_idx]
      if pattern_lower not in line.lower():
        line_idx += 1
        continue
      end_line = min(line_idx + num_show_lines, len(page.lines))
      snippet = '\n'.join(page.lines[line_idx:end_line])
      link_fmt = f'【{len(result_chunks)}†match at L{line_idx}】'
      result_chunks.append(f'{link_fmt}\n{snippet}')
      if len(result_chunks) >= max_results:
        break
      line_idx += num_show_lines
    if not result_chunks:
      find_page.text = f'No `find` results for pattern: `{pattern}`'
    else:
      find_page.text = '\n\n'.join(result_chunks)
    find_page.lines = self._wrap_lines(find_page.text, 80)
    return find_page
# ---- public API: search / open / find ------------------------------------
  def search(self, *, query: str, topn: int = 5) -> Dict[str, Any]:
    if not self._client:
      raise RuntimeError('Client not provided')
    resp = self._client.web_search(query, max_results=topn)
    normalized: Dict[str, Any] = {'results': {}}
    rows: List[Dict[str, str]] = []
    for item in resp.results:
      content = item.content or ''
      rows.append(
        {
          'title': item.title,
          'url': item.url,
          'content': content,
        }
      )
    normalized['results'][query] = rows
    search_page = self._build_search_results_page_collection(query, normalized)
    self._save_page(search_page)
    cursor = len(self.get_state().page_stack) - 1
    for query_results in normalized.get('results', {}).values():
      for i, r in enumerate(query_results):
        ws = WebSearchResult(
          title=r.get('title', ''),
          url=r.get('url', ''),
          content={'fullText': r.get('content', '') or ''},
        )
        result_page = self._build_search_result_page(ws, i + 1)
        data = self.get_state()
        data.url_to_page[result_page.url] = result_page
        self.state.set_data(data)
    page_text = self._display_page(search_page, cursor, loc=0, num_lines=-1)
    return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
  def open(
    self,
    *,
    id: Optional[str | int] = None,
    cursor: int = -1,
    loc: int = 0,
    num_lines: int = -1,
  ) -> Dict[str, Any]:
    if not self._client:
      raise RuntimeError('Client not provided')
    state = self.get_state()
    if isinstance(id, str):
      url = id
      if url in state.url_to_page:
        self._save_page(state.url_to_page[url])
        cursor = len(self.get_state().page_stack) - 1
        page_text = self._display_page(state.url_to_page[url], cursor, loc, num_lines)
        return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
      fetch_response = self._client.web_fetch(url)
      normalized: Dict[str, Any] = {
        'results': {
          url: [
            {
              'title': fetch_response.title or url,
              'url': url,
              'content': fetch_response.content or '',
            }
          ]
        }
      }
      new_page = self._build_page_from_fetch(url, normalized)
      self._save_page(new_page)
      cursor = len(self.get_state().page_stack) - 1
      page_text = self._display_page(new_page, cursor, loc, num_lines)
      return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
    # Resolve current page from stack only if needed (int id or no id)
    page: Optional[Page] = None
    if cursor >= 0:
      if state.page_stack:
        if cursor >= len(state.page_stack):
          cursor = max(0, len(state.page_stack) - 1)
        page = self._page_from_stack(state.page_stack[cursor])
      else:
        page = None
    else:
      if state.page_stack:
        page = self._page_from_stack(state.page_stack[-1])
    if isinstance(id, int):
      if not page:
        raise RuntimeError('No current page to resolve link from')
      link_url = page.links.get(id)
      if not link_url:
        err = Page(
          url=f'invalid_link_{id}',
          title=f'No link with id {id} on `{page.title}`',
          text='',
          lines=[],
          links={},
          fetched_at=datetime.utcnow(),
        )
        available = sorted(page.links.keys())
        available_list = ', '.join(map(str, available)) if available else '(none)'
        err.text = '\n'.join(
          [
            f'Requested link id: {id}',
            f'Current page: {page.title}',
            f'Available link ids on this page: {available_list}',
            '',
            'Tips:',
            '- To scroll this page, call browser_open with { loc, num_lines } (no id).',
            '- To open a result from a search results page, pass the correct { cursor, id }.',
          ]
        )
        err.lines = self._wrap_lines(err.text, 80)
        self._save_page(err)
        cursor = len(self.get_state().page_stack) - 1
        page_text = self._display_page(err, cursor, 0, -1)
        return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
      new_page = state.url_to_page.get(link_url)
      if not new_page:
        fetch_response = self._client.web_fetch(link_url)
        normalized: Dict[str, Any] = {
          'results': {
            link_url: [
              {
                'title': fetch_response.title or link_url,
                'url': link_url,
                'content': fetch_response.content or '',
              }
            ]
          }
        }
        new_page = self._build_page_from_fetch(link_url, normalized)
      self._save_page(new_page)
      cursor = len(self.get_state().page_stack) - 1
      page_text = self._display_page(new_page, cursor, loc, num_lines)
      return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
    if not page:
      raise RuntimeError('No current page to display')
    cur = self.get_state()
    cur.page_stack.append(page.url)
    self.state.set_data(cur)
    cursor = len(cur.page_stack) - 1
    page_text = self._display_page(page, cursor, loc, num_lines)
    return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
def find(self, *, pattern: str, cursor: int = -1) -> Dict[str, Any]:
state = self.get_state()
if cursor == -1:
if not state.page_stack:
raise RuntimeError('No pages to search in')
page = self._page_from_stack(state.page_stack[-1])
cursor = len(state.page_stack) - 1
else:
if cursor < 0 or cursor >= len(state.page_stack):
cursor = max(0, min(cursor, len(state.page_stack) - 1))
page = self._page_from_stack(state.page_stack[cursor])
find_page = self._build_find_results_page(pattern, page)
self._save_page(find_page)
new_cursor = len(self.get_state().page_stack) - 1
page_text = self._display_page(find_page, new_cursor, 0, -1)
return {'state': self.get_state(), 'pageText': cap_tool_content(page_text)}
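
The cursor handling in `find` above can be sketched in isolation. This is a minimal illustration, not part of the helper: `resolve_cursor` and the sample `stack` are hypothetical names, and the sketch assumes an empty stack should raise for any cursor value, whereas the original only raises explicitly when `cursor == -1`.

```python
from typing import List


def resolve_cursor(page_stack: List[str], cursor: int = -1) -> int:
  """Return a valid index into page_stack (hypothetical helper, not in the repo)."""
  if not page_stack:
    # Assumption: fail early for any cursor when nothing has been saved yet.
    raise RuntimeError('No pages to search in')
  if cursor == -1:
    # -1 means "the most recently saved page".
    return len(page_stack) - 1
  # Any other value is clamped into the valid index range.
  return max(0, min(cursor, len(page_stack) - 1))


stack = ['search results', 'https://example.com/a', 'https://example.com/b']
print(resolve_cursor(stack))      # most recent page
print(resolve_cursor(stack, 10))  # too large, clamped to the last index
print(resolve_cursor(stack, -5))  # too small, clamped to the first index
```

Clamping rather than raising keeps a tool call usable when the model passes a stale or out-of-range cursor, which appears to be the intent of the original code.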
================================================
FILE: ollama/__init__.py
================================================
from ollama._client import AsyncClient, Client
from ollama._types import (
ChatResponse,
EmbeddingsResponse,
EmbedResponse,
GenerateResponse,
Image,
ListResponse,
Message,
Options,
ProcessResponse,
ProgressResponse,
RequestError,
ResponseError,
ShowResponse,
StatusResponse,
Tool,
WebFetchResponse,
WebSearchResponse,
)
__all__ = [
'AsyncClient',
'ChatResponse',
'Client',
'EmbedResponse',
'EmbeddingsResponse',
'GenerateResponse',
'Image',
'ListResponse',
'Message',
'Options',
'ProcessResponse',
'ProgressResponse',
'RequestError',
'ResponseError',
'ShowResponse',
'StatusResponse',
'Tool',
'WebFetchResponse',
'WebSearchResponse',
]
_client = Client()
generate = _client.generate
chat = _client.chat
embed = _client.embed
embeddings = _client.embeddings
pull = _client.pull
push = _client.push
create = _client.create
delete = _client.delete
list = _client.list
copy = _client.copy
show = _client.show
ps = _client.ps
web_search = _client.web_search
web_fetch = _client.web_fetch
================================================
FILE: ollama/_client.py
================================================
import contextlib
import ipaddress
import json
import os
import platform
import sys
import urllib.parse
from hashlib import sha256
from os import PathLike
from pathlib import Path
from typing import (
Any,
Callable,
Dict,
List,
Literal,
Mapping,
Optional,
Sequence,
Type,
TypeVar,
Union,
overload,
)
import anyio
from pydantic.json_schema import JsonSchemaValue
from ollama._utils import convert_function_to_tool
if sys.version_info < (3, 9):
from typing import AsyncIterator, Iterator
else:
from collections.abc import AsyncIterator, Iterator
from importlib import metadata
try:
__version__ = metadata.version('ollama')
except metadata.PackageNotFoundError:
__version__ = '0.0.0'
import httpx
from ollama._types import (
ChatRequest,
ChatResponse,
CopyRequest,
CreateRequest,
DeleteRequest,
EmbeddingsRequest,
EmbeddingsResponse,
EmbedRequest,
EmbedResponse,
GenerateRequest,
GenerateResponse,
Image,
ListResponse,
Message,
Options,
ProcessResponse,
ProgressResponse,
PullRequest,
PushRequest,
ResponseError,
ShowRequest,
ShowResponse,
StatusResponse,
Tool,
WebFetchRequest,
WebFetchResponse,
WebSearchRequest,
WebSearchResponse,
)
T = TypeVar('T')
class BaseClient(contextlib.AbstractContextManager, contextlib.AbstractAsyncContextManager):
def __init__(
self,
client,
host: Optional[str] = None,
*,
follow_redirects: bool = True,
timeout: Any = None,
headers: Optional[Mapping[str, str]] = None,
**kwargs,
) -> None:
"""
Creates an httpx client. Default parameters are the same as those defined in httpx
except for the following:
- `follow_redirects`: True
- `timeout`: None
`kwargs` are passed to the httpx client.
"""
headers = {
k.lower(): v
for k, v in {
**(headers or {}),
'Content-Type': 'application/json',
'Accept': 'application/json',
'User-Agent': f'ollama-python/{__version__} ({platform.machine()} {platform.system().lower()}) Python/{platform.python_version()}',
}.items()
if v is not None
}
api_key = os.getenv('OLLAMA_API_KEY', None)
if not headers.get('authorization') and api_key:
headers['authorization'] = f'Bearer {api_key}'
self._client = client(
base_url=_parse_host(host or os.getenv('OLLAMA_HOST')),
follow_redirects=follow_redirects,
timeout=timeout,
headers=headers,
**kwargs,
)
def __exit__(self, exc_type, exc_val, exc_tb):
self.close()
async def __aexit__(self, exc_type, exc_val, exc_tb):
await self.close()
CONNECTION_ERROR_MESSAGE = 'Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible. https://ollama.com/download'
class Client(BaseClient):
def __init__(self, host: Optional[str] = None, **kwargs) -> None:
super().__init__(httpx.Client, host, **kwargs)
def close(self):
self._client.close()
def _request_raw(self, *args, **kwargs):
try:
r = self._client.request(*args, **kwargs)
r.raise_for_status()
return r
except httpx.HTTPStatusError as e:
raise ResponseError(e.response.text, e.response.status_code) from None
except httpx.ConnectError:
raise ConnectionError(CONNECTION_ERROR_MESSAGE) from None
@overload
def _request(
self,
cls: Type[T],
*args,
stream: Literal[False] = False,
**kwargs,
) -> T: ...
@overload
def _request(
self,
cls: Type[T],
*args,
stream: Literal[True] = True,
**kwargs,
) -> Iterator[T]: ...
@overload
def _request(
self,
cls: Type[T],
*args,
stream: bool = False,
**kwargs,
) -> Union[T, Iterator[T]]: ...
def _request(
self,
cls: Type[T],
*args,
stream: bool = False,
**kwargs,
) -> Union[T, Iterator[T]]:
if stream:
def inner():
with self._client.stream(*args, **kwargs) as r:
try:
r.raise_for_status()
except httpx.HTTPStatusError as e:
e.response.read()
raise ResponseError(e.response.text, e.response.status_code) from None
for line in r.iter_lines():
part = json.loads(line)
if err := part.get('error'):
raise ResponseError(err)
yield cls(**part)
return inner()
return cls(**self._request_raw(*args, **kwargs).json())
@overload
def generate(
self,
model: str = '',
prompt: str = '',
suffix: str = '',
*,
system: str = '',
template: str = '',
context: Optional[Sequence[int]] = None,
stream: Literal[False] = False,
think: Optional[bool] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> GenerateResponse: ...
@overload
def generate(
self,
model: str = '',
prompt: str = '',
suffix: str = '',
*,
system: str = '',
template: str = '',
context: Optional[Sequence[int]] = None,
stream: Literal[True] = True,
think: Optional[bool] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> Iterator[GenerateResponse]: ...
def generate(
self,
model: str = '',
prompt: Optional[str] = None,
suffix: Optional[str] = None,
*,
system: Optional[str] = None,
template: Optional[str] = None,
context: Optional[Sequence[int]] = None,
stream: bool = False,
think: Optional[bool] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: Optional[bool] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> Union[GenerateResponse, Iterator[GenerateResponse]]:
"""
Create a response using the requested model.
Raises `RequestError` if a model is not provided.
Raises `ResponseError` if the request could not be fulfilled.
Returns `GenerateResponse` if `stream` is `False`, otherwise returns a `GenerateResponse` generator.
"""
return self._request(
GenerateResponse,
'POST',
'/api/generate',
json=GenerateRequest(
model=model,
prompt=prompt,
suffix=suffix,
system=system,
template=template,
context=context,
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
raw=raw,
format=format,
images=list(_copy_images(images)) if images else None,
options=options,
keep_alive=keep_alive,
width=width,
height=height,
steps=steps,
).model_dump(exclude_none=True),
stream=stream,
)
@overload
def chat(
self,
model: str = '',
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[False] = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> ChatResponse: ...
@overload
def chat(
self,
model: str = '',
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[True] = True,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> Iterator[ChatResponse]: ...
def chat(
self,
model: str = '',
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: bool = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> Union[ChatResponse, Iterator[ChatResponse]]:
"""
Create a chat response using the requested model.
Args:
tools:
A JSON schema as a dict, an Ollama Tool or a Python Function.
Python functions need to follow Google style docstrings to be converted to an Ollama Tool.
For more information, see: https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings
stream: Whether to stream the response.
format: The format of the response.
Example:
  def add_two_numbers(a: int, b: int) -> int:
    '''
    Add two numbers together.
    Args:
      a: First number to add
      b: Second number to add
    Returns:
      int: The sum of a and b
    '''
    return a + b
  client.chat(model='llama3.2', tools=[add_two_numbers], messages=[...])
Raises `RequestError` if a model is not provided.
Raises `ResponseError` if the request could not be fulfilled.
Returns `ChatResponse` if `stream` is `False`, otherwise returns a `ChatResponse` generator.
"""
return self._request(
ChatResponse,
'POST',
'/api/chat',
json=ChatRequest(
model=model,
messages=list(_copy_messages(messages)),
tools=list(_copy_tools(tools)),
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
format=format,
options=options,
keep_alive=keep_alive,
).model_dump(exclude_none=True),
stream=stream,
)
def embed(
self,
model: str = '',
input: Union[str, Sequence[str]] = '',
truncate: Optional[bool] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
dimensions: Optional[int] = None,
) -> EmbedResponse:
return self._request(
EmbedResponse,
'POST',
'/api/embed',
json=EmbedRequest(
model=model,
input=input,
truncate=truncate,
options=options,
keep_alive=keep_alive,
dimensions=dimensions,
).model_dump(exclude_none=True),
)
def embeddings(
self,
model: str = '',
prompt: Optional[str] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> EmbeddingsResponse:
"""
Deprecated in favor of `embed`.
"""
return self._request(
EmbeddingsResponse,
'POST',
'/api/embeddings',
json=EmbeddingsRequest(
model=model,
prompt=prompt,
options=options,
keep_alive=keep_alive,
).model_dump(exclude_none=True),
)
@overload
def pull(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[False] = False,
) -> ProgressResponse: ...
@overload
def pull(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[True] = True,
) -> Iterator[ProgressResponse]: ...
def pull(
self,
model: str,
*,
insecure: bool = False,
stream: bool = False,
) -> Union[ProgressResponse, Iterator[ProgressResponse]]:
"""
Raises `ResponseError` if the request could not be fulfilled.
Returns `ProgressResponse` if `stream` is `False`, otherwise returns a `ProgressResponse` generator.
"""
return self._request(
ProgressResponse,
'POST',
'/api/pull',
json=PullRequest(
model=model,
insecure=insecure,
stream=stream,
).model_dump(exclude_none=True),
stream=stream,
)
@overload
def push(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[False] = False,
) -> ProgressResponse: ...
@overload
def push(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[True] = True,
) -> Iterator[ProgressResponse]: ...
def push(
self,
model: str,
*,
insecure: bool = False,
stream: bool = False,
) -> Union[ProgressResponse, Iterator[ProgressResponse]]:
"""
Raises `ResponseError` if the request could not be fulfilled.
Returns `ProgressResponse` if `stream` is `False`, otherwise returns a `ProgressResponse` generator.
"""
return self._request(
ProgressResponse,
'POST',
'/api/push',
json=PushRequest(
model=model,
insecure=insecure,
stream=stream,
).model_dump(exclude_none=True),
stream=stream,
)
@overload
def create(
self,
model: str,
quantize: Optional[str] = None,
from_: Optional[str] = None,
files: Optional[Dict[str, str]] = None,
adapters: Optional[Dict[str, str]] = None,
template: Optional[str] = None,
license: Optional[Union[str, List[str]]] = None,
system: Optional[str] = None,
parameters: Optional[Union[Mapping[str, Any], Options]] = None,
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
stream: Literal[False] = False,
) -> ProgressResponse: ...
@overload
def create(
self,
model: str,
quantize: Optional[str] = None,
from_: Optional[str] = None,
files: Optional[Dict[str, str]] = None,
adapters: Optional[Dict[str, str]] = None,
template: Optional[str] = None,
license: Optional[Union[str, List[str]]] = None,
system: Optional[str] = None,
parameters: Optional[Union[Mapping[str, Any], Options]] = None,
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
stream: Literal[True] = True,
) -> Iterator[ProgressResponse]: ...
def create(
self,
model: str,
quantize: Optional[str] = None,
from_: Optional[str] = None,
files: Optional[Dict[str, str]] = None,
adapters: Optional[Dict[str, str]] = None,
template: Optional[str] = None,
license: Optional[Union[str, List[str]]] = None,
system: Optional[str] = None,
parameters: Optional[Union[Mapping[str, Any], Options]] = None,
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
stream: bool = False,
) -> Union[ProgressResponse, Iterator[ProgressResponse]]:
"""
Raises `ResponseError` if the request could not be fulfilled.
Returns `ProgressResponse` if `stream` is `False`, otherwise returns a `ProgressResponse` generator.
"""
return self._request(
ProgressResponse,
'POST',
'/api/create',
json=CreateRequest(
model=model,
stream=stream,
quantize=quantize,
from_=from_,
files=files,
adapters=adapters,
license=license,
template=template,
system=system,
parameters=parameters,
messages=messages,
).model_dump(exclude_none=True),
stream=stream,
)
def create_blob(self, path: Union[str, Path]) -> str:
sha256sum = sha256()
with open(path, 'rb') as r:
while True:
chunk = r.read(32 * 1024)
if not chunk:
break
sha256sum.update(chunk)
digest = f'sha256:{sha256sum.hexdigest()}'
with open(path, 'rb') as r:
self._request_raw('POST', f'/api/blobs/{digest}', content=r)
return digest
def list(self) -> ListResponse:
return self._request(
ListResponse,
'GET',
'/api/tags',
)
def delete(self, model: str) -> StatusResponse:
r = self._request_raw(
'DELETE',
'/api/delete',
json=DeleteRequest(
model=model,
).model_dump(exclude_none=True),
)
return StatusResponse(
status='success' if r.status_code == 200 else 'error',
)
def copy(self, source: str, destination: str) -> StatusResponse:
r = self._request_raw(
'POST',
'/api/copy',
json=CopyRequest(
source=source,
destination=destination,
).model_dump(exclude_none=True),
)
return StatusResponse(
status='success' if r.status_code == 200 else 'error',
)
def show(self, model: str) -> ShowResponse:
return self._request(
ShowResponse,
'POST',
'/api/show',
json=ShowRequest(
model=model,
).model_dump(exclude_none=True),
)
def ps(self) -> ProcessResponse:
return self._request(
ProcessResponse,
'GET',
'/api/ps',
)
def web_search(self, query: str, max_results: int = 3) -> WebSearchResponse:
"""
Performs a web search
Args:
query: The query to search for
max_results: The maximum number of results to return (default: 3)
Returns:
WebSearchResponse with the search results
Raises:
ValueError: If no Authorization header with a Bearer token is configured (for example via the OLLAMA_API_KEY environment variable)
"""
if not self._client.headers.get('authorization', '').startswith('Bearer '):
raise ValueError('Authorization header with Bearer token is required for web search')
return self._request(
WebSearchResponse,
'POST',
'https://ollama.com/api/web_search',
json=WebSearchRequest(
query=query,
max_results=max_results,
).model_dump(exclude_none=True),
)
def web_fetch(self, url: str) -> WebFetchResponse:
"""
Fetches the content of a web page for the provided URL.
Args:
url: The URL to fetch
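Returns:
WebFetchResponse with the fetched result
Raises:
ValueError: If no Authorization header with a Bearer token is configured (for example via the OLLAMA_API_KEY environment variable)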
"""
if not self._client.headers.get('authorization', '').startswith('Bearer '):
raise ValueError('Authorization header with Bearer token is required for web fetch')
return self._request(
WebFetchResponse,
'POST',
'https://ollama.com/api/web_fetch',
json=WebFetchRequest(
url=url,
).model_dump(exclude_none=True),
)
class AsyncClient(BaseClient):
def __init__(self, host: Optional[str] = None, **kwargs) -> None:
super().__init__(httpx.AsyncClient, host, **kwargs)
async def close(self):
await self._client.aclose()
async def _request_raw(self, *args, **kwargs):
try:
r = await self._client.request(*args, **kwargs)
r.raise_for_status()
return r
except httpx.HTTPStatusError as e:
raise ResponseError(e.response.text, e.response.status_code) from None
except httpx.ConnectError:
raise ConnectionError(CONNECTION_ERROR_MESSAGE) from None
@overload
async def _request(
self,
cls: Type[T],
*args,
stream: Literal[False] = False,
**kwargs,
) -> T: ...
@overload
async def _request(
self,
cls: Type[T],
*args,
stream: Literal[True] = True,
**kwargs,
) -> AsyncIterator[T]: ...
@overload
async def _request(
self,
cls: Type[T],
*args,
stream: bool = False,
**kwargs,
) -> Union[T, AsyncIterator[T]]: ...
async def _request(
self,
cls: Type[T],
*args,
stream: bool = False,
**kwargs,
) -> Union[T, AsyncIterator[T]]:
if stream:
async def inner():
async with self._client.stream(*args, **kwargs) as r:
try:
r.raise_for_status()
except httpx.HTTPStatusError as e:
await e.response.aread()
raise ResponseError(e.response.text, e.response.status_code) from None
async for line in r.aiter_lines():
part = json.loads(line)
if err := part.get('error'):
raise ResponseError(err)
yield cls(**part)
return inner()
return cls(**(await self._request_raw(*args, **kwargs)).json())
async def web_search(self, query: str, max_results: int = 3) -> WebSearchResponse:
"""
Performs a web search
Args:
query: The query to search for
max_results: The maximum number of results to return (default: 3)
Returns:
WebSearchResponse with the search results
"""
return await self._request(
WebSearchResponse,
'POST',
'https://ollama.com/api/web_search',
json=WebSearchRequest(
query=query,
max_results=max_results,
).model_dump(exclude_none=True),
)
async def web_fetch(self, url: str) -> WebFetchResponse:
"""
Fetches the content of a web page for the provided URL.
Args:
url: The URL to fetch
Returns:
WebFetchResponse with the fetched result
"""
return await self._request(
WebFetchResponse,
'POST',
'https://ollama.com/api/web_fetch',
json=WebFetchRequest(
url=url,
).model_dump(exclude_none=True),
)
@overload
async def generate(
self,
model: str = '',
prompt: str = '',
suffix: str = '',
*,
system: str = '',
template: str = '',
context: Optional[Sequence[int]] = None,
stream: Literal[False] = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> GenerateResponse: ...
@overload
async def generate(
self,
model: str = '',
prompt: str = '',
suffix: str = '',
*,
system: str = '',
template: str = '',
context: Optional[Sequence[int]] = None,
stream: Literal[True] = True,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> AsyncIterator[GenerateResponse]: ...
async def generate(
self,
model: str = '',
prompt: Optional[str] = None,
suffix: Optional[str] = None,
*,
system: Optional[str] = None,
template: Optional[str] = None,
context: Optional[Sequence[int]] = None,
stream: bool = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: Optional[bool] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> Union[GenerateResponse, AsyncIterator[GenerateResponse]]:
"""
Create a response using the requested model.
Raises `RequestError` if a model is not provided.
Raises `ResponseError` if the request could not be fulfilled.
Returns `GenerateResponse` if `stream` is `False`, otherwise returns an asynchronous `GenerateResponse` generator.
"""
return await self._request(
GenerateResponse,
'POST',
'/api/generate',
json=GenerateRequest(
model=model,
prompt=prompt,
suffix=suffix,
system=system,
template=template,
context=context,
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
raw=raw,
format=format,
images=list(_copy_images(images)) if images else None,
options=options,
keep_alive=keep_alive,
width=width,
height=height,
steps=steps,
).model_dump(exclude_none=True),
stream=stream,
)
@overload
async def chat(
self,
model: str = '',
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[False] = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> ChatResponse: ...
@overload
async def chat(
self,
model: str = '',
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[True] = True,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> AsyncIterator[ChatResponse]: ...
async def chat(
self,
model: str = '',
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: bool = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> Union[ChatResponse, AsyncIterator[ChatResponse]]:
"""
Create a chat response using the requested model.
Args:
tools:
A JSON schema as a dict, an Ollama Tool or a Python Function.
Python functions need to follow Google style docstrings to be converted to an Ollama Tool.
For more information, see: https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings
stream: Whether to stream the response.
format: The format of the response.
Example:
  def add_two_numbers(a: int, b: int) -> int:
    '''
    Add two numbers together.
    Args:
      a: First number to add
      b: Second number to add
    Returns:
      int: The sum of a and b
    '''
    return a + b
  await client.chat(model='llama3.2', tools=[add_two_numbers], messages=[...])
Raises `RequestError` if a model is not provided.
Raises `ResponseError` if the request could not be fulfilled.
Returns `ChatResponse` if `stream` is `False`, otherwise returns an asynchronous `ChatResponse` generator.
"""
return await self._request(
ChatResponse,
'POST',
'/api/chat',
json=ChatRequest(
model=model,
messages=list(_copy_messages(messages)),
tools=list(_copy_tools(tools)),
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
format=format,
options=options,
keep_alive=keep_alive,
).model_dump(exclude_none=True),
stream=stream,
)
async def embed(
self,
model: str = '',
input: Union[str, Sequence[str]] = '',
truncate: Optional[bool] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
dimensions: Optional[int] = None,
) -> EmbedResponse:
return await self._request(
EmbedResponse,
'POST',
'/api/embed',
json=EmbedRequest(
model=model,
input=input,
truncate=truncate,
options=options,
keep_alive=keep_alive,
dimensions=dimensions,
).model_dump(exclude_none=True),
)
async def embeddings(
self,
model: str = '',
prompt: Optional[str] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
) -> EmbeddingsResponse:
"""
Deprecated in favor of `embed`.
"""
return await self._request(
EmbeddingsResponse,
'POST',
'/api/embeddings',
json=EmbeddingsRequest(
model=model,
prompt=prompt,
options=options,
keep_alive=keep_alive,
).model_dump(exclude_none=True),
)
@overload
async def pull(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[False] = False,
) -> ProgressResponse: ...
@overload
async def pull(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[True] = True,
) -> AsyncIterator[ProgressResponse]: ...
async def pull(
self,
model: str,
*,
insecure: bool = False,
stream: bool = False,
) -> Union[ProgressResponse, AsyncIterator[ProgressResponse]]:
"""
Raises `ResponseError` if the request could not be fulfilled.
Returns `ProgressResponse` if `stream` is `False`, otherwise returns an asynchronous `ProgressResponse` generator.
"""
return await self._request(
ProgressResponse,
'POST',
'/api/pull',
json=PullRequest(
model=model,
insecure=insecure,
stream=stream,
).model_dump(exclude_none=True),
stream=stream,
)
@overload
async def push(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[False] = False,
) -> ProgressResponse: ...
@overload
async def push(
self,
model: str,
*,
insecure: bool = False,
stream: Literal[True] = True,
) -> AsyncIterator[ProgressResponse]: ...
async def push(
self,
model: str,
*,
insecure: bool = False,
stream: bool = False,
) -> Union[ProgressResponse, AsyncIterator[ProgressResponse]]:
"""
Raises `ResponseError` if the request could not be fulfilled.
Returns `ProgressResponse` if `stream` is `False`, otherwise returns an asynchronous `ProgressResponse` generator.
"""
return await self._request(
ProgressResponse,
'POST',
'/api/push',
json=PushRequest(
model=model,
insecure=insecure,
stream=stream,
).model_dump(exclude_none=True),
stream=stream,
)
@overload
async def create(
self,
model: str,
quantize: Optional[str] = None,
from_: Optional[str] = None,
files: Optional[Dict[str, str]] = None,
adapters: Optional[Dict[str, str]] = None,
template: Optional[str] = None,
license: Optional[Union[str, List[str]]] = None,
system: Optional[str] = None,
parameters: Optional[Union[Mapping[str, Any], Options]] = None,
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
stream: Literal[False] = False,
) -> ProgressResponse: ...
@overload
async def create(
self,
model: str,
quantize: Optional[str] = None,
from_: Optional[str] = None,
files: Optional[Dict[str, str]] = None,
adapters: Optional[Dict[str, str]] = None,
template: Optional[str] = None,
license: Optional[Union[str, List[str]]] = None,
system: Optional[str] = None,
parameters: Optional[Union[Mapping[str, Any], Options]] = None,
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
stream: Literal[True] = True,
) -> AsyncIterator[ProgressResponse]: ...
async def create(
self,
model: str,
quantize: Optional[str] = None,
from_: Optional[str] = None,
files: Optional[Dict[str, str]] = None,
adapters: Optional[Dict[str, str]] = None,
template: Optional[str] = None,
license: Optional[Union[str, List[str]]] = None,
system: Optional[str] = None,
parameters: Optional[Union[Mapping[str, Any], Options]] = None,
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None,
*,
stream: bool = False,
) -> Union[ProgressResponse, AsyncIterator[ProgressResponse]]:
"""
Raises `ResponseError` if the request could not be fulfilled.
Returns `ProgressResponse` if `stream` is `False`, otherwise returns an asynchronous iterator of `ProgressResponse`.
"""
return await self._request(
ProgressResponse,
'POST',
'/api/create',
json=CreateRequest(
model=model,
stream=stream,
quantize=quantize,
from_=from_,
files=files,
adapters=adapters,
license=license,
template=template,
system=system,
parameters=parameters,
messages=messages,
).model_dump(exclude_none=True),
stream=stream,
)
async def create_blob(self, path: Union[str, Path]) -> str:
sha256sum = sha256()
async with await anyio.open_file(path, 'rb') as r:
while True:
chunk = await r.read(32 * 1024)
if not chunk:
break
sha256sum.update(chunk)
digest = f'sha256:{sha256sum.hexdigest()}'
async def upload_bytes():
async with await anyio.open_file(path, 'rb') as r:
while True:
chunk = await r.read(32 * 1024)
if not chunk:
break
yield chunk
await self._request_raw('POST', f'/api/blobs/{digest}', content=upload_bytes())
return digest
async def list(self) -> ListResponse:
return await self._request(
ListResponse,
'GET',
'/api/tags',
)
async def delete(self, model: str) -> StatusResponse:
r = await self._request_raw(
'DELETE',
'/api/delete',
json=DeleteRequest(
model=model,
).model_dump(exclude_none=True),
)
return StatusResponse(
status='success' if r.status_code == 200 else 'error',
)
async def copy(self, source: str, destination: str) -> StatusResponse:
r = await self._request_raw(
'POST',
'/api/copy',
json=CopyRequest(
source=source,
destination=destination,
).model_dump(exclude_none=True),
)
return StatusResponse(
status='success' if r.status_code == 200 else 'error',
)
async def show(self, model: str) -> ShowResponse:
return await self._request(
ShowResponse,
'POST',
'/api/show',
json=ShowRequest(
model=model,
).model_dump(exclude_none=True),
)
async def ps(self) -> ProcessResponse:
return await self._request(
ProcessResponse,
'GET',
'/api/ps',
)
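# Illustrative sketch (not part of the client API): how the digest that
# `create_blob` sends to `/api/blobs/{digest}` could be reproduced with the
# standard library alone. The helper name `_example_blob_digest` and its
# `chunk_size` parameter are hypothetical; only the 32 KiB read size and the
# 'sha256:<hex>' format are taken from the method above.
def _example_blob_digest(path, chunk_size=32 * 1024):
  import hashlib

  sha256sum = hashlib.sha256()
  with open(path, 'rb') as r:
    # Read fixed-size chunks so that large model files are never held in memory at once.
    while True:
      chunk = r.read(chunk_size)
      if not chunk:
        break
      sha256sum.update(chunk)
  return f'sha256:{sha256sum.hexdigest()}'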
def _copy_images(images: Optional[Sequence[Union[Image, Any]]]) -> Iterator[Image]:
for image in images or []:
yield image if isinstance(image, Image) else Image(value=image)
def _copy_messages(messages: Optional[Sequence[Union[Mapping[str, Any], Message]]]) -> Iterator[Message]:
for message in messages or []:
yield Message.model_validate(
{k: list(_copy_images(v)) if k == 'images' else v for k, v in dict(message).items() if v},
)
def _copy_tools(tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None) -> Iterator[Tool]:
for unprocessed_tool in tools or []:
yield convert_function_to_tool(unprocessed_tool) if callable(unprocessed_tool) else Tool.model_validate(unprocessed_tool)
def _as_path(s: Optional[Union[str, PathLike]]) -> Union[Path, None]:
if isinstance(s, (str, Path)):
try:
if (p := Path(s)).exists():
return p
except Exception:
...
return None
def _parse_host(host: Optional[str]) -> str:
"""
>>> _parse_host(None)
'http://127.0.0.1:11434'
>>> _parse_host('')
'http://127.0.0.1:11434'
>>> _parse_host('1.2.3.4')
'http://1.2.3.4:11434'
>>> _parse_host(':56789')
'http://127.0.0.1:56789'
>>> _parse_host('1.2.3.4:56789')
'http://1.2.3.4:56789'
>>> _parse_host('http://1.2.3.4')
'http://1.2.3.4:80'
>>> _parse_host('https://1.2.3.4')
'https://1.2.3.4:443'
>>> _parse_host('https://1.2.3.4:56789')
'https://1.2.3.4:56789'
>>> _parse_host('example.com')
'http://example.com:11434'
>>> _parse_host('example.com:56789')
'http://example.com:56789'
>>> _parse_host('http://example.com')
'http://example.com:80'
>>> _parse_host('https://example.com')
'https://example.com:443'
>>> _parse_host('https://example.com:56789')
'https://example.com:56789'
>>> _parse_host('example.com/')
'http://example.com:11434'
>>> _parse_host('example.com:56789/')
'http://example.com:56789'
>>> _parse_host('example.com/path')
'http://example.com:11434/path'
>>> _parse_host('example.com:56789/path')
'http://example.com:56789/path'
>>> _parse_host('https://example.com:56789/path')
'https://example.com:56789/path'
>>> _parse_host('example.com:56789/path/')
'http://example.com:56789/path'
>>> _parse_host('[0001:002:003:0004::1]')
'http://[0001:002:003:0004::1]:11434'
>>> _parse_host('[0001:002:003:0004::1]:56789')
'http://[0001:002:003:0004::1]:56789'
>>> _parse_host('http://[0001:002:003:0004::1]')
'http://[0001:002:003:0004::1]:80'
>>> _parse_host('https://[0001:002:003:0004::1]')
'https://[0001:002:003:0004::1]:443'
>>> _parse_host('https://[0001:002:003:0004::1]:56789')
'https://[0001:002:003:0004::1]:56789'
>>> _parse_host('[0001:002:003:0004::1]/')
'http://[0001:002:003:0004::1]:11434'
>>> _parse_host('[0001:002:003:0004::1]:56789/')
'http://[0001:002:003:0004::1]:56789'
>>> _parse_host('[0001:002:003:0004::1]/path')
'http://[0001:002:003:0004::1]:11434/path'
>>> _parse_host('[0001:002:003:0004::1]:56789/path')
'http://[0001:002:003:0004::1]:56789/path'
>>> _parse_host('https://[0001:002:003:0004::1]:56789/path')
'https://[0001:002:003:0004::1]:56789/path'
>>> _parse_host('[0001:002:003:0004::1]:56789/path/')
'http://[0001:002:003:0004::1]:56789/path'
"""
host, port = host or '', 11434
scheme, _, hostport = host.partition('://')
if not hostport:
scheme, hostport = 'http', host
elif scheme == 'http':
port = 80
elif scheme == 'https':
port = 443
split = urllib.parse.urlsplit(f'{scheme}://{hostport}')
host = split.hostname or '127.0.0.1'
port = split.port or port
try:
if isinstance(ipaddress.ip_address(host), ipaddress.IPv6Address):
# Fix missing square brackets for IPv6 from urlsplit
host = f'[{host}]'
except ValueError:
...
if path := split.path.strip('/'):
return f'{scheme}://{host}:{port}/{path}'
return f'{scheme}://{host}:{port}'
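# Illustrative sketch of why `_parse_host` re-adds square brackets:
# `urllib.parse.urlsplit` reports an IPv6 `hostname` without them, so joining
# it back with a port would produce an ambiguous URL. The helper name
# `_example_bracket_ipv6` is hypothetical and only mirrors the try/except
# step above.
def _example_bracket_ipv6(host):
  import ipaddress

  try:
    if isinstance(ipaddress.ip_address(host), ipaddress.IPv6Address):
      return f'[{host}]'
  except ValueError:
    # Not an IP literal (for example a DNS name); leave it unchanged.
    pass
  return host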
================================================
FILE: ollama/_types.py
================================================
import contextlib
import json
from base64 import b64decode, b64encode
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Mapping, Optional, Sequence, Union
from pydantic import (
BaseModel,
ByteSize,
ConfigDict,
Field,
model_serializer,
)
from pydantic.json_schema import JsonSchemaValue
from typing_extensions import Annotated, Literal
class SubscriptableBaseModel(BaseModel):
def __getitem__(self, key: str) -> Any:
"""
>>> msg = Message(role='user')
>>> msg['role']
'user'
>>> msg = Message(role='user')
>>> msg['nonexistent']
Traceback (most recent call last):
KeyError: 'nonexistent'
"""
if key in self:
return getattr(self, key)
raise KeyError(key)
def __setitem__(self, key: str, value: Any) -> None:
"""
>>> msg = Message(role='user')
>>> msg['role'] = 'assistant'
>>> msg['role']
'assistant'
>>> tool_call = Message.ToolCall(function=Message.ToolCall.Function(name='foo', arguments={}))
>>> msg = Message(role='user', content='hello')
>>> msg['tool_calls'] = [tool_call]
>>> msg['tool_calls'][0]['function']['name']
'foo'
"""
setattr(self, key, value)
def __contains__(self, key: str) -> bool:
"""
>>> msg = Message(role='user')
>>> 'nonexistent' in msg
False
>>> 'role' in msg
True
>>> 'content' in msg
False
>>> msg.content = 'hello!'
>>> 'content' in msg
True
>>> msg = Message(role='user', content='hello!')
>>> 'content' in msg
True
>>> 'tool_calls' in msg
False
>>> msg['tool_calls'] = []
>>> 'tool_calls' in msg
True
>>> msg['tool_calls'] = [Message.ToolCall(function=Message.ToolCall.Function(name='foo', arguments={}))]
>>> 'tool_calls' in msg
True
>>> msg['tool_calls'] = None
>>> 'tool_calls' in msg
True
>>> tool = Tool()
>>> 'type' in tool
True
"""
if key in self.model_fields_set:
return True
if value := self.__class__.model_fields.get(key):
return value.default is not None
return False
def get(self, key: str, default: Any = None) -> Any:
"""
>>> msg = Message(role='user')
>>> msg.get('role')
'user'
>>> msg = Message(role='user')
>>> msg.get('nonexistent')
>>> msg = Message(role='user')
>>> msg.get('nonexistent', 'default')
'default'
>>> msg = Message(role='user', tool_calls=[ Message.ToolCall(function=Message.ToolCall.Function(name='foo', arguments={}))])
>>> msg.get('tool_calls')[0]['function']['name']
'foo'
"""
return getattr(self, key) if hasattr(self, key) else default
class Options(SubscriptableBaseModel):
# load time options
numa: Optional[bool] = None
num_ctx: Optional[int] = None
num_batch: Optional[int] = None
num_gpu: Optional[int] = None
main_gpu: Optional[int] = None
low_vram: Optional[bool] = None
f16_kv: Optional[bool] = None
logits_all: Optional[bool] = None
vocab_only: Optional[bool] = None
use_mmap: Optional[bool] = None
use_mlock: Optional[bool] = None
embedding_only: Optional[bool] = None
num_thread: Optional[int] = None
# runtime options
num_keep: Optional[int] = None
seed: Optional[int] = None
num_predict: Optional[int] = None
top_k: Optional[int] = None
top_p: Optional[float] = None
tfs_z: Optional[float] = None
typical_p: Optional[float] = None
repeat_last_n: Optional[int] = None
temperature: Optional[float] = None
repeat_penalty: Optional[float] = None
presence_penalty: Optional[float] = None
frequency_penalty: Optional[float] = None
mirostat: Optional[int] = None
mirostat_tau: Optional[float] = None
mirostat_eta: Optional[float] = None
penalize_newline: Optional[bool] = None
stop: Optional[Sequence[str]] = None
class BaseRequest(SubscriptableBaseModel):
model: Annotated[str, Field(min_length=1)]
'Model to use for the request.'
class BaseStreamableRequest(BaseRequest):
stream: Optional[bool] = None
'Stream response.'
class BaseGenerateRequest(BaseStreamableRequest):
options: Optional[Union[Mapping[str, Any], Options]] = None
'Options to use for the request.'
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None
'Format of the response.'
keep_alive: Optional[Union[float, str]] = None
'Keep model alive for the specified duration.'
class Image(BaseModel):
value: Union[str, bytes, Path]
@model_serializer
def serialize_model(self):
if isinstance(self.value, (Path, bytes)):
return b64encode(self.value.read_bytes() if isinstance(self.value, Path) else self.value).decode()
if isinstance(self.value, str):
try:
if Path(self.value).exists():
return b64encode(Path(self.value).read_bytes()).decode()
except Exception:
# Long base64 string can't be wrapped in Path, so try to treat as base64 string
pass
# String might be a file path, but might not exist
if self.value.split('.')[-1] in ('png', 'jpg', 'jpeg', 'webp'):
raise ValueError(f'File {self.value} does not exist')
try:
# Try to decode to check if it's already base64
b64decode(self.value)
return self.value
except Exception as e:
raise ValueError('Invalid image data, expected base64 string or path to image file') from e
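# Illustrative sketch of the encoding that `Image.serialize_model` applies to
# raw bytes: the payload is base64-encoded and decoded to `str` so that it can
# be embedded in the JSON request body. The helper name
# `_example_encode_image_bytes` is hypothetical.
def _example_encode_image_bytes(data):
  from base64 import b64encode

  return b64encode(data).decode()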
class GenerateRequest(BaseGenerateRequest):
prompt: Optional[str] = None
'Prompt to generate response from.'
suffix: Optional[str] = None
'Suffix to append to the response.'
system: Optional[str] = None
'System prompt to prepend to the prompt.'
template: Optional[str] = None
'Template to use for the response.'
context: Optional[Sequence[int]] = None
'Tokenized history to use for the response.'
raw: Optional[bool] = None
images: Optional[Sequence[Image]] = None
'Image data for multimodal models.'
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None
'Enable thinking mode (for thinking models).'
logprobs: Optional[bool] = None
'Return log probabilities for generated tokens.'
top_logprobs: Optional[int] = None
'Number of alternative tokens and log probabilities to include per position (0-20).'
# Experimental image generation parameters
width: Optional[int] = None
'Width of the generated image in pixels (for image generation models).'
height: Optional[int] = None
'Height of the generated image in pixels (for image generation models).'
steps: Optional[int] = None
'Number of diffusion steps (for image generation models).'
class BaseGenerateResponse(SubscriptableBaseModel):
model: Optional[str] = None
'Model used to generate response.'
created_at: Optional[str] = None
'Time when the request was created.'
done: Optional[bool] = None
'True if response is complete, otherwise False. Useful for streaming to detect the final response.'
done_reason: Optional[str] = None
'Reason for completion. Only present when done is True.'
total_duration: Optional[int] = None
'Total duration in nanoseconds.'
load_duration: Optional[int] = None
'Load duration in nanoseconds.'
prompt_eval_count: Optional[int] = None
'Number of tokens evaluated in the prompt.'
prompt_eval_duration: Optional[int] = None
'Duration of evaluating the prompt in nanoseconds.'
eval_count: Optional[int] = None
'Number of tokens evaluated in inference.'
eval_duration: Optional[int] = None
'Duration of evaluating inference in nanoseconds.'
class TokenLogprob(SubscriptableBaseModel):
token: str
'Token text.'
logprob: float
'Log probability for the token.'
class Logprob(TokenLogprob):
top_logprobs: Optional[Sequence[TokenLogprob]] = None
'Most likely tokens and their log probabilities.'
class GenerateResponse(BaseGenerateResponse):
"""
Response returned by generate requests.
"""
response: Optional[str] = None
'Response content. When streaming, this contains a fragment of the response.'
thinking: Optional[str] = None
'Thinking content. Only present when thinking is enabled.'
context: Optional[Sequence[int]] = None
'Tokenized history up to the point of the response.'
logprobs: Optional[Sequence[Logprob]] = None
'Log probabilities for generated tokens.'
# Image generation response fields
image: Optional[str] = None
'Base64-encoded generated image data (for image generation models).'
# Streaming progress fields (for image generation)
completed: Optional[int] = None
'Number of completed steps (for image generation streaming).'
total: Optional[int] = None
'Total number of steps (for image generation streaming).'
class Message(SubscriptableBaseModel):
"""
Chat message.
"""
role: str
"Assumed role of the message. Response messages have role 'assistant' or 'tool'."
content: Optional[str] = None
'Content of the message. Response messages contain message fragments when streaming.'
thinking: Optional[str] = None
'Thinking content. Only present when thinking is enabled.'
images: Optional[Sequence[Image]] = None
"""
Optional list of image data for multimodal models.
Valid input types are:
- `str` or path-like object: path to image file
- `bytes` or bytes-like object: raw image data
Valid image formats depend on the model. See the model card for more information.
"""
tool_name: Optional[str] = None
'Name of the executed tool.'
class ToolCall(SubscriptableBaseModel):
"""
Model tool calls.
"""
class Function(SubscriptableBaseModel):
"""
Tool call function.
"""
name: str
'Name of the function.'
arguments: Mapping[str, Any]
'Arguments of the function.'
function: Function
'Function to be called.'
tool_calls: Optional[Sequence[ToolCall]] = None
"""
Tool calls to be made by the model.
"""
class Tool(SubscriptableBaseModel):
type: Optional[str] = 'function'
class Function(SubscriptableBaseModel):
name: Optional[str] = None
description: Optional[str] = None
class Parameters(SubscriptableBaseModel):
model_config = ConfigDict(populate_by_name=True)
type: Optional[Literal['object']] = 'object'
defs: Optional[Any] = Field(None, alias='$defs')
items: Optional[Any] = None
required: Optional[Sequence[str]] = None
class Property(SubscriptableBaseModel):
model_config = ConfigDict(arbitrary_types_allowed=True)
type: Optional[Union[str, Sequence[str]]] = None
items: Optional[Any] = None
description: Optional[str] = None
enum: Optional[Sequence[Any]] = None
properties: Optional[Mapping[str, Property]] = None
parameters: Optional[Parameters] = None
function: Optional[Function] = None
class ChatRequest(BaseGenerateRequest):
@model_serializer(mode='wrap')
def serialize_model(self, nxt):
output = nxt(self)
if output.get('tools'):
for tool in output['tools']:
if 'function' in tool and 'parameters' in tool['function'] and 'defs' in tool['function']['parameters']:
tool['function']['parameters']['$defs'] = tool['function']['parameters'].pop('defs')
return output
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None
'Messages to chat with.'
tools: Optional[Sequence[Tool]] = None
'Tools to use for the chat.'
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None
'Enable thinking mode (for thinking models).'
logprobs: Optional[bool] = None
'Return log probabilities for generated tokens.'
top_logprobs: Optional[int] = None
'Number of alternative tokens and log probabilities to include per position (0-20).'
class ChatResponse(BaseGenerateResponse):
"""
Response returned by chat requests.
"""
message: Message
'Response message.'
logprobs: Optional[Sequence[Logprob]] = None
'Log probabilities for generated tokens if requested.'
class EmbedRequest(BaseRequest):
input: Union[str, Sequence[str]]
'Input text to embed.'
truncate: Optional[bool] = None
'Truncate the input to the maximum token length.'
options: Optional[Union[Mapping[str, Any], Options]] = None
'Options to use for the request.'
keep_alive: Optional[Union[float, str]] = None
dimensions: Optional[int] = None
'Truncate the output embedding to the specified number of dimensions.'
class EmbedResponse(BaseGenerateResponse):
"""
Response returned by embed requests.
"""
embeddings: Sequence[Sequence[float]]
'Embeddings of the inputs.'
class EmbeddingsRequest(BaseRequest):
prompt: Optional[str] = None
'Prompt to generate embeddings from.'
options: Optional[Union[Mapping[str, Any], Options]] = None
'Options to use for the request.'
keep_alive: Optional[Union[float, str]] = None
class EmbeddingsResponse(SubscriptableBaseModel):
"""
Response returned by embeddings requests.
"""
embedding: Sequence[float]
'Embedding of the prompt.'
class PullRequest(BaseStreamableRequest):
"""
Request to pull the model.
"""
insecure: Optional[bool] = None
'Allow insecure (HTTP) connections.'
class PushRequest(BaseStreamableRequest):
"""
Request to push the model.
"""
insecure: Optional[bool] = None
'Allow insecure (HTTP) connections.'
class CreateRequest(BaseStreamableRequest):
"""
Request to create a new model.
"""
@model_serializer(mode='wrap')
def serialize_model(self, nxt):
output = nxt(self)
if 'from_' in output:
output['from'] = output.pop('from_')
return output
quantize: Optional[str] = None
from_: Optional[str] = None
files: Optional[Dict[str, str]] = None
adapters: Optional[Dict[str, str]] = None
template: Optional[str] = None
license: Optional[Union[str, List[str]]] = None
system: Optional[str] = None
parameters: Optional[Union[Mapping[str, Any], Options]] = None
messages: Optional[Sequence[Union[Mapping[str, Any], Message]]] = None
class ModelDetails(SubscriptableBaseModel):
parent_model: Optional[str] = None
format: Optional[str] = None
family: Optional[str] = None
families: Optional[Sequence[str]] = None
parameter_size: Optional[str] = None
quantization_level: Optional[str] = None
class ListResponse(SubscriptableBaseModel):
class Model(SubscriptableBaseModel):
model: Optional[str] = None
modified_at: Optional[datetime] = None
digest: Optional[str] = None
size: Optional[ByteSize] = None
details: Optional[ModelDetails] = None
models: Sequence[Model]
'List of models.'
class DeleteRequest(BaseRequest):
"""
Request to delete a model.
"""
class CopyRequest(BaseModel):
"""
Request to copy a model.
"""
source: str
'Source model to copy.'
destination: str
'Destination model to copy to.'
class StatusResponse(SubscriptableBaseModel):
status: Optional[str] = None
class ProgressResponse(StatusResponse):
completed: Optional[int] = None
total: Optional[int] = None
digest: Optional[str] = None
class ShowRequest(BaseRequest):
"""
Request to show model information.
"""
class ShowResponse(SubscriptableBaseModel):
modified_at: Optional[datetime] = None
template: Optional[str] = None
modelfile: Optional[str] = None
license: Optional[str] = None
details: Optional[ModelDetails] = None
modelinfo: Optional[Mapping[str, Any]] = Field(alias='model_info')
parameters: Optional[str] = None
capabilities: Optional[List[str]] = None
class ProcessResponse(SubscriptableBaseModel):
class Model(SubscriptableBaseModel):
model: Optional[str] = None
name: Optional[str] = None
digest: Optional[str] = None
expires_at: Optional[datetime] = None
size: Optional[ByteSize] = None
size_vram: Optional[ByteSize] = None
details: Optional[ModelDetails] = None
context_length: Optional[int] = None
models: Sequence[Model]
class WebSearchRequest(SubscriptableBaseModel):
query: str
max_results: Optional[int] = None
class WebSearchResult(SubscriptableBaseModel):
content: Optional[str] = None
title: Optional[str] = None
url: Optional[str] = None
class WebFetchRequest(SubscriptableBaseModel):
url: str
class WebSearchResponse(SubscriptableBaseModel):
results: Sequence[WebSearchResult]
class WebFetchResponse(SubscriptableBaseModel):
title: Optional[str] = None
content: Optional[str] = None
links: Optional[Sequence[str]] = None
class RequestError(Exception):
"""
Common class for request errors.
"""
def __init__(self, error: str):
super().__init__(error)
self.error = error
'Reason for the error.'
class ResponseError(Exception):
"""
Common class for response errors.
"""
def __init__(self, error: str, status_code: int = -1):
# try to parse content as JSON and extract 'error'
# fallback to raw content if JSON parsing fails or the payload is not a JSON object
with contextlib.suppress(json.JSONDecodeError, AttributeError):
error = json.loads(error).get('error', error)
super().__init__(error)
self.error = error
'Reason for the error.'
self.status_code = status_code
'HTTP status code of the response.'
def __str__(self) -> str:
return f'{self.error} (status code: {self.status_code})'
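# Illustrative sketch of the message extraction in `ResponseError.__init__`:
# a JSON body such as '{"error": "..."}' is reduced to its 'error' value,
# while anything that is not valid JSON is kept verbatim. The function name
# `_example_error_message` is hypothetical, and the `isinstance` check is an
# added assumption that guards against JSON that is not an object.
def _example_error_message(content):
  import json

  try:
    parsed = json.loads(content)
  except json.JSONDecodeError:
    # Not JSON at all: keep the raw response text.
    return content
  if isinstance(parsed, dict):
    return parsed.get('error', content)
  return content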
================================================
FILE: ollama/_utils.py
================================================
from __future__ import annotations
import inspect
import re
from collections import defaultdict
from typing import Callable, Union
import pydantic
from ollama._types import Tool
def _parse_docstring(doc_string: Union[str, None]) -> dict[str, str]:
parsed_docstring = defaultdict(str)
if not doc_string:
return parsed_docstring
key = str(hash(doc_string))
for line in doc_string.splitlines():
lowered_line = line.lower().strip()
if lowered_line.startswith('args:'):
key = 'args'
elif lowered_line.startswith(('returns:', 'yields:', 'raises:')):
key = '_'
else:
# maybe change to a list and join later
parsed_docstring[key] += f'{line.strip()}\n'
last_key = None
for line in parsed_docstring['args'].splitlines():
line = line.strip()
if ':' in line:
# Split the line on either:
# 1. A parenthetical expression like (integer) - captured in group 1
# 2. A colon :
# Followed by optional whitespace. Only split on first occurrence.
parts = re.split(r'(?:\(([^)]*)\)|:)\s*', line, maxsplit=1)
arg_name = parts[0].strip()
last_key = arg_name
# Get the description - always in parts[-1]; parts[1] holds the parenthetical type, if any
arg_description = parts[-1].strip()
if len(parts) > 2 and parts[1]: # Has parenthetical content
arg_description = parts[-1].split(':', 1)[-1].strip()
parsed_docstring[last_key] = arg_description
elif last_key and line:
parsed_docstring[last_key] += ' ' + line
return parsed_docstring
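# Illustrative sketch of the split that `_parse_docstring` applies to a single
# 'Args:' line. The helper name `_example_split_arg_line` is hypothetical; the
# regular expression and the post-processing are copied from the loop above.
def _example_split_arg_line(line):
  import re

  parts = re.split(r'(?:\(([^)]*)\)|:)\s*', line.strip(), maxsplit=1)
  arg_name = parts[0].strip()
  arg_description = parts[-1].strip()
  if len(parts) > 2 and parts[1]:
    # A parenthetical type such as '(int)' matched first, so the remainder
    # still starts with the colon; drop everything up to and including it.
    arg_description = parts[-1].split(':', 1)[-1].strip()
  return arg_name, arg_description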
def convert_function_to_tool(func: Callable) -> Tool:
doc_string_hash = str(hash(inspect.getdoc(func)))
parsed_docstring = _parse_docstring(inspect.getdoc(func))
schema = type(
func.__name__,
(pydantic.BaseModel,),
{
'__annotations__': {k: v.annotation if v.annotation != inspect._empty else str for k, v in inspect.signature(func).parameters.items()},
'__signature__': inspect.signature(func),
'__doc__': parsed_docstring[doc_string_hash],
},
).model_json_schema()
for k, v in schema.get('properties', {}).items():
# If type is missing, the default is string
types = {t.get('type', 'string') for t in v.get('anyOf')} if 'anyOf' in v else {v.get('type', 'string')}
if 'null' in types:
schema['required'].remove(k)
types.discard('null')
schema['properties'][k] = {
'description': parsed_docstring[k],
'type': ', '.join(types),
}
tool = Tool(
type='function',
function=Tool.Function(
name=func.__name__,
description=schema.get('description', ''),
parameters=Tool.Function.Parameters(**schema),
),
)
return Tool.model_validate(tool)
================================================
FILE: ollama/py.typed
================================================
================================================
FILE: pyproject.toml
================================================
[project]
name = 'ollama'
description = 'The official Python client for Ollama.'
authors = [
{ email = 'hello@ollama.com' },
]
readme = 'README.md'
requires-python = '>=3.8'
dependencies = [
'httpx>=0.27',
'pydantic>=2.9',
]
dynamic = [ 'version' ]
license = "MIT"
[project.urls]
homepage = 'https://ollama.com'
repository = 'https://github.com/ollama/ollama-python'
issues = 'https://github.com/ollama/ollama-python/issues'
[build-system]
requires = [ 'hatchling', 'hatch-vcs' ]
build-backend = 'hatchling.build'
[tool.hatch.version]
source = 'vcs'
[tool.hatch.envs.hatch-test]
default-args = ['ollama', 'tests']
extra-dependencies = [
'pytest-anyio',
'pytest-httpserver',
]
[tool.hatch.envs.hatch-static-analysis]
dependencies = [ 'ruff>=0.9.1' ]
config-path = 'none'
[tool.ruff]
line-length = 320
indent-width = 2
[tool.ruff.format]
quote-style = 'single'
indent-style = 'space'
docstring-code-format = false
[tool.ruff.lint]
select = [
'F', # pyflakes
'E', # pycodestyle errors
'W', # pycodestyle warnings
'I', # sort imports
'N', # pep8-naming
'ASYNC', # flake8-async
'FBT', # flake8-boolean-trap
'B', # flake8-bugbear
'C4', # flake8-comprehensions
'PIE', # flake8-pie
'SIM', # flake8-simplify
'FLY', # flynt
'RUF', # ruff-specific rules
]
ignore = ['FBT001'] # Boolean-typed positional argument in function definition
[tool.pytest.ini_options]
addopts = ['--doctest-modules']
================================================
FILE: requirements.txt
================================================
# This file was autogenerated by uv via the following command:
# uv export
-e .
annotated-types==0.7.0 \
--hash=sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53 \
--hash=sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89
# via pydantic
anyio==4.5.2 ; python_full_version < '3.9' \
--hash=sha256:23009af4ed04ce05991845451e11ef02fc7c5ed29179ac9a420e5ad0ac7ddc5b \
--hash=sha256:c011ee36bc1e8ba40e5a81cb9df91925c218fe9b778554e0b56a21e1b5d4716f
# via httpx
anyio==4.8.0 ; python_full_version >= '3.9' \
--hash=sha256:1d9fe889df5212298c0c0723fa20479d1b94883a2df44bd3897aa91083316f7a \
--hash=sha256:b5011f270ab5eb0abf13385f851315585cc37ef330dd88e27ec3d34d651fd47a
# via httpx
certifi==2025.1.31 \
--hash=sha256:3d5da6925056f6f18f119200434a4780a94263f10d1c21d032a6f6b2baa20651 \
--hash=sha256:ca78db4565a652026a4db2bcdf68f2fb589ea80d0be70e03929ed730746b84fe
# via
# httpcore
# httpx
exceptiongroup==1.2.2 ; python_full_version < '3.11' \
--hash=sha256:3111b9d131c238bec2f8f516e123e14ba243563fb135d3fe885990585aa7795b \
--hash=sha256:47c2edf7c6738fafb49fd34290706d1a1a2f4d1c6df275526b62cbb4aa5393cc
# via anyio
h11==0.14.0 \
--hash=sha256:8f19fbbe99e72420ff35c00b27a34cb9937e902a8b810e2c88300c6f0a3b699d \
--hash=sha256:e3fe4ac4b851c468cc8363d500db52c2ead036020723024a109d37346efaa761
# via httpcore
httpcore==1.0.7 \
--hash=sha256:8551cb62a169ec7162ac7be8d4817d561f60e08eaa485234898414bb5a8a0b4c \
--hash=sha256:a3fff8f43dc260d5bd363d9f9cf1830fa3a458b332856f34282de498ed420edd
# via httpx
httpx==0.28.1 \
--hash=sha256:75e98c5f16b0f35b567856f597f06ff2270a374470a5c2392242528e3e3e42fc \
--hash=sha256:d909fcccc110f8c7faf814ca82a9a4d816bc5a6dbfea25d6591d6985b8ba59ad
# via ollama
idna==3.10 \
--hash=sha256:12f65c9b470abda6dc35cf8e63cc574b1c52b11df2c86030af0ac09b01b13ea9 \
--hash=sha256:946d195a0d259cbba61165e88e65941f16e9b36ea6ddb97f00452bae8b1287d3
# via
# anyio
# httpx
pydantic==2.10.6 \
--hash=sha256:427d664bf0b8a2b34ff5dd0f5a18df00591adcee7198fbd71981054cef37b584 \
--hash=sha256:ca5daa827cce33de7a42be142548b0096bf05a7e7b365aebfa5f8eeec7128236
# via ollama
pydantic-core==2.27.2 \
--hash=sha256:00bad2484fa6bda1e216e7345a798bd37c68fb2d97558edd584942aa41b7d278 \
--hash=sha256:0296abcb83a797db256b773f45773da397da75a08f5fcaef41f2044adec05f50 \
--hash=sha256:03d0f86ea3184a12f41a2d23f7ccb79cdb5a18e06993f8a45baa8dfec746f0e9 \
--hash=sha256:044a50963a614ecfae59bb1eaf7ea7efc4bc62f49ed594e18fa1e5d953c40e9f \
--hash=sha256:05e3a55d124407fffba0dd6b0c0cd056d10e983ceb4e5dbd10dda135c31071d6 \
--hash=sha256:08e125dbdc505fa69ca7d9c499639ab6407cfa909214d500897d02afb816e7cc \
--hash=sha256:097830ed52fd9e427942ff3b9bc17fab52913b2f50f2880dc4a5611446606a54 \
--hash=sha256:0d1e85068e818c73e048fe28cfc769040bb1f475524f4745a5dc621f75ac7630 \
--hash=sha256:0d75070718e369e452075a6017fbf187f788e17ed67a3abd47fa934d001863d9 \
--hash=sha256:14d4a5c49d2f009d62a2a7140d3064f686d17a5d1a268bc641954ba181880236 \
--hash=sha256:172fce187655fece0c90d90a678424b013f8fbb0ca8b036ac266749c09438cb7 \
--hash=sha256:18a101c168e4e092ab40dbc2503bdc0f62010e95d292b27827871dc85450d7ee \
--hash=sha256:1a4207639fb02ec2dbb76227d7c751a20b1a6b4bc52850568e52260cae64ca3b \
--hash=sha256:1c1fd185014191700554795c99b347d64f2bb637966c4cfc16998a0ca700d048 \
--hash=sha256:1e2cb691ed9834cd6a8be61228471d0a503731abfb42f82458ff27be7b2186fc \
--hash=sha256:1ebaf1d0481914d004a573394f4be3a7616334be70261007e47c2a6fe7e50130 \
--hash=sha256:220f892729375e2d736b97d0e51466252ad84c51857d4d15f5e9692f9ef12be4 \
--hash=sha256:251136cdad0cb722e93732cb45ca5299fb56e1344a833640bf93b2803f8d1bfd \
--hash=sha256:26f0d68d4b235a2bae0c3fc585c585b4ecc51382db0e3ba402a22cbc440915e4 \
--hash=sha256:26f32e0adf166a84d0cb63be85c562ca8a6fa8de28e5f0d92250c6b7e9e2aff7 \
--hash=sha256:280d219beebb0752699480fe8f1dc61ab6615c2046d76b7ab7ee38858de0a4e7 \
--hash=sha256:28ccb213807e037460326424ceb8b5245acb88f32f3d2777427476e1b32c48c4 \
--hash=sha256:2bf14caea37e91198329b828eae1618c068dfb8ef17bb33287a7ad4b61ac314e \
--hash=sha256:2d367ca20b2f14095a8f4fa1210f5a7b78b8a20009ecced6b12818f455b1e9fa \
--hash=sha256:30c5f68ded0c36466acede341551106821043e9afaad516adfb6e8fa80a4e6a6 \
--hash=sha256:337b443af21d488716f8d0b6164de833e788aa6bd7e3a39c005febc1284f4962 \
--hash=sha256:3911ac9284cd8a1792d3cb26a2da18f3ca26c6908cc434a18f730dc0db7bfa3b \
--hash=sha256:3d591580c34f4d731592f0e9fe40f9cc1b430d297eecc70b962e93c5c668f15f \
--hash=sha256:3de3ce3c9ddc8bbd88f6e0e304dea0e66d843ec9de1b0042b0911c1663ffd474 \
--hash=sha256:3de9961f2a346257caf0aa508a4da705467f53778e9ef6fe744c038119737ef5 \
--hash=sha256:40d02e7d45c9f8af700f3452f329ead92da4c5f4317ca9b896de7ce7199ea459 \
--hash=sha256:42c5f762659e47fdb7b16956c71598292f60a03aa92f8b6351504359dbdba6cf \
--hash=sha256:47956ae78b6422cbd46f772f1746799cbb862de838fd8d1fbd34a82e05b0983a \
--hash=sha256:491a2b73db93fab69731eaee494f320faa4e093dbed776be1a829c2eb222c34c \
--hash=sha256:4c9775e339e42e79ec99c441d9730fccf07414af63eac2f0e48e08fd38a64d76 \
--hash=sha256:4e0b4220ba5b40d727c7f879eac379b822eee5d8fff418e9d3381ee45b3b0362 \
--hash=sha256:50a68f3e3819077be2c98110c1f9dcb3817e93f267ba80a2c05bb4f8799e2ff4 \
--hash=sha256:519f29f5213271eeeeb3093f662ba2fd512b91c5f188f3bb7b27bc5973816934 \
--hash=sha256:521eb9b7f036c9b6187f0b47318ab0d7ca14bd87f776240b90b21c1f4f149320 \
--hash=sha256:57762139821c31847cfb2df63c12f725788bd9f04bc2fb392790959b8f70f118 \
--hash=sha256:5e4f4bb20d75e9325cc9696c6802657b58bc1dbbe3022f32cc2b2b632c3fbb96 \
--hash=sha256:5e68c4446fe0810e959cdff46ab0a41ce2f2c86d227d96dc3847af0ba7def306 \
--hash=sha256:669e193c1c576a58f132e3158f9dfa9662969edb1a250c54d8fa52590045f046 \
--hash=sha256:688d3fd9fcb71f41c4c015c023d12a79d1c4c0732ec9eb35d96e3388a120dcf3 \
--hash=sha256:6fb4aadc0b9a0c063206846d603b92030eb6f03069151a625667f982887153e2 \
--hash=sha256:7041c36f5680c6e0f08d922aed302e98b3745d97fe1589db0a3eebf6624523af \
--hash=sha256:71b24c7d61131bb83df10cc7e687433609963a944ccf45190cfc21e0887b08c9 \
--hash=sha256:77d1bca19b0f7021b3a982e6f903dcd5b2b06076def36a652e3907f596e29f67 \
--hash=sha256:7969e133a6f183be60e9f6f56bfae753585680f3b7307a8e555a948d443cc05a \
--hash=sha256:7a66efda2387de898c8f38c0cf7f14fca0b51a8ef0b24bfea5849f1b3c95af27 \
--hash=sha256:7d0c8399fcc1848491f00e0314bd59fb34a9c008761bcb422a057670c3f65e35 \
--hash=sha256:7d14bd329640e63852364c306f4d23eb744e0f8193148d4044dd3dacdaacbd8b \
--hash=sha256:7e17b560be3c98a8e3aa66ce828bdebb9e9ac6ad5466fba92eb74c4c95cb1151 \
--hash=sha256:8083d4e875ebe0b864ffef72a4304827015cff328a1be6e22cc850753bfb122b \
--hash=sha256:82f91663004eb8ed30ff478d77c4d1179b3563df6cdb15c0817cd1cdaf34d154 \
--hash=sha256:82f986faf4e644ffc189a7f1aafc86e46ef70372bb153e7001e8afccc6e54133 \
--hash=sha256:83097677b8e3bd7eaa6775720ec8e0405f1575015a463285a92bfdfe254529ef \
--hash=sha256:85210c4d99a0114f5a9481b44560d7d1e35e32cc5634c656bc48e590b669b145 \
--hash=sha256:8c19d1ea0673cd13cc2f872f6c9ab42acc4e4f492a7ca9d3795ce2b112dd7e15 \
--hash=sha256:8d9b3388db186ba0c099a6d20f0604a44eabdeef1777ddd94786cdae158729e4 \
--hash=sha256:8e10c99ef58cfdf2a66fc15d66b16c4a04f62bca39db589ae8cba08bc55331bc \
--hash=sha256:953101387ecf2f5652883208769a79e48db18c6df442568a0b5ccd8c2723abee \
--hash=sha256:9c3ed807c7b91de05e63930188f19e921d1fe90de6b4f5cd43ee7fcc3525cb8c \
--hash=sha256:9e0c8cfefa0ef83b4da9588448b6d8d2a2bf1a53c3f1ae5fca39eb3061e2f0b0 \
--hash=sha256:9fdbe7629b996647b99c01b37f11170a57ae675375b14b8c13b8518b8320ced5 \
--hash=sha256:a0fcd29cd6b4e74fe8ddd2c90330fd8edf2e30cb52acda47f06dd615ae72da57 \
--hash=sha256:ac4dbfd1691affb8f48c2c13241a2e3b60ff23247cbcf981759c768b6633cf8b \
--hash=sha256:b0cb791f5b45307caae8810c2023a184c74605ec3bcbb67d13846c28ff731ff8 \
--hash=sha256:ba5dd002f88b78a4215ed2f8ddbdf85e8513382820ba15ad5ad8955ce0ca19a1 \
--hash=sha256:bca101c00bff0adb45a833f8451b9105d9df18accb8743b08107d7ada14bd7da \
--hash=sha256:bd8086fa684c4775c27f03f062cbb9eaa6e17f064307e86b21b9e0abc9c0f02e \
--hash=sha256:bec317a27290e2537f922639cafd54990551725fc844249e64c523301d0822fc \
--hash=sha256:c10eb4f1659290b523af58fa7cffb452a61ad6ae5613404519aee4bfbf1df993 \
--hash=sha256:c33939a82924da9ed65dab5a65d427205a73181d8098e79b6b426bdf8ad4e656 \
--hash=sha256:c61709a844acc6bf0b7dce7daae75195a10aac96a596ea1b776996414791ede4 \
--hash=sha256:c70c26d2c99f78b125a3459f8afe1aed4d9687c24fd677c6a4436bc042e50d6c \
--hash=sha256:c817e2b40aba42bac6f457498dacabc568c3b7a986fc9ba7c8d9d260b71485fb \
--hash=sha256:cabb9bcb7e0d97f74df8646f34fc76fbf793b7f6dc2438517d7a9e50eee4f14d \
--hash=sha256:cc3f1a99a4f4f9dd1de4fe0312c114e740b5ddead65bb4102884b384c15d8bc9 \
--hash=sha256:cca63613e90d001b9f2f9a9ceb276c308bfa2a43fafb75c8031c4f66039e8c6e \
--hash=sha256:ce8918cbebc8da707ba805b7fd0b382816858728ae7fe19a942080c24e5b7cd1 \
--hash=sha256:d2088237af596f0a524d3afc39ab3b036e8adb054ee57cbb1dcf8e09da5b29cc \
--hash=sha256:d262606bf386a5ba0b0af3b97f37c83d7011439e3dc1a9298f21efb292e42f1a \
--hash=sha256:d2d63f1215638d28221f664596b1ccb3944f6e25dd18cd3b86b0a4c408d5ebb9 \
--hash=sha256:d3e8d504bdd3f10835468f29008d72fc8359d95c9c415ce6e767203db6127506 \
--hash=sha256:d4041c0b966a84b4ae7a09832eb691a35aec90910cd2dbe7a208de59be77965b \
--hash=sha256:d716e2e30c6f140d7560ef1538953a5cd1a87264c737643d481f2779fc247fe1 \
--hash=sha256:d81d2068e1c1228a565af076598f9e7451712700b673de8f502f0334f281387d \
--hash=sha256:d9640b0059ff4f14d1f37321b94061c6db164fbe49b334b31643e0528d100d99 \
--hash=sha256:de3cd1899e2c279b140adde9357c4495ed9d47131b4a4eaff9052f23398076b3 \
--hash=sha256:e0fd26b16394ead34a424eecf8a31a1f5137094cabe84a1bcb10fa6ba39d3d31 \
--hash=sha256:e2bb4d3e5873c37bb3dd58714d4cd0b0e6238cebc4177ac8fe878f8b3aa8e74c \
--hash=sha256:eb026e5a4c1fee05726072337ff51d1efb6f59090b7da90d30ea58625b1ffb39 \
--hash=sha256:eda3f5c2a021bbc5d976107bb302e0131351c2ba54343f8a496dc8783d3d3a6a \
--hash=sha256:ef592d4bad47296fb11f96cd7dc898b92e795032b4894dfb4076cfccd43a9308 \
--hash=sha256:f141ee28a0ad2123b6611b6ceff018039df17f32ada8b534e6aa039545a3efb2 \
--hash=sha256:f66d89ba397d92f840f8654756196d93804278457b5fbede59598a1f9f90b228 \
--hash=sha256:f6f8e111843bbb0dee4cb6594cdc73e79b3329b526037ec242a3e49012495b3b \
--hash=sha256:fa8e459d4954f608fa26116118bb67f56b93b209c39b008277ace29937453dc9 \
--hash=sha256:fd1aea04935a508f62e0d0ef1f5ae968774a32afc306fb8545e06f5ff5cdf3ad
# via pydantic
sniffio==1.3.1 \
--hash=sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2 \
--hash=sha256:f4324edc670a0f49750a81b895f35c3adb843cca46f0530f79fc1babb23789dc
# via anyio
typing-extensions==4.12.2 \
--hash=sha256:04e5ca0351e0f3f85c6853954072df659d0d13fac324d0072316b67d7794700d \
--hash=sha256:1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8
# via
# annotated-types
# anyio
# pydantic
# pydantic-core
================================================
FILE: tests/test_client.py
================================================
import base64
import json
import os
import re
import tempfile
from pathlib import Path
from typing import Any
import pytest
from httpx import Response as httpxResponse
from pydantic import BaseModel
from pytest_httpserver import HTTPServer, URIPattern
from werkzeug.wrappers import Request, Response
from ollama._client import CONNECTION_ERROR_MESSAGE, AsyncClient, Client, _copy_tools
from ollama._types import Image, Message
# Base64-encoded 1x1 PNG used as a minimal image payload throughout these tests.
PNG_BASE64 = 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGNgYGAAAAAEAAH2FzhVAAAAAElFTkSuQmCC'
PNG_BYTES = base64.b64decode(PNG_BASE64)
pytestmark = pytest.mark.anyio
@pytest.fixture
def anyio_backend():
return 'asyncio'
# URI matcher for pytest-httpserver that accepts any request path starting with the given prefix.
class PrefixPattern(URIPattern):
def __init__(self, prefix: str):
self.prefix = prefix
def match(self, uri):
return uri.startswith(self.prefix)
def test_client_chat(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': "I don't know.",
},
}
)
client = Client(httpserver.url_for('/'))
response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == "I don't know."
def test_client_chat_with_logprobs(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Hi'}],
'tools': [],
'stream': False,
'logprobs': True,
'top_logprobs': 3,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': 'Hello',
},
'logprobs': [
{
'token': 'Hello',
'logprob': -0.1,
'top_logprobs': [
{'token': 'Hello', 'logprob': -0.1},
{'token': 'Hi', 'logprob': -1.0},
],
}
],
}
)
client = Client(httpserver.url_for('/'))
response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Hi'}], logprobs=True, top_logprobs=3)
assert response['logprobs'][0]['token'] == 'Hello'
assert response['logprobs'][0]['top_logprobs'][1]['token'] == 'Hi'
def test_client_chat_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
for message in ['I ', "don't ", 'know.']:
yield (
json.dumps(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': message,
},
}
)
+ '\n'
)
return Response(generate())
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'stream': True,
},
).respond_with_handler(stream_handler)
client = Client(httpserver.url_for('/'))
response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], stream=True)
it = iter(['I ', "don't ", 'know.'])
for part in response:
assert part['message']['role'] == 'assistant'
assert part['message']['content'] == next(it)
@pytest.mark.parametrize('message_format', ('dict', 'pydantic_model'))
@pytest.mark.parametrize('file_style', ('path', 'bytes'))
def test_client_chat_images(httpserver: HTTPServer, message_format: str, file_style: str, tmp_path):
from ollama._types import Image, Message
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [
{
'role': 'user',
'content': 'Why is the sky blue?',
'images': [PNG_BASE64],
},
],
'tools': [],
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': "I don't know.",
},
}
)
client = Client(httpserver.url_for('/'))
if file_style == 'bytes':
image_content = PNG_BYTES
elif file_style == 'path':
image_path = tmp_path / 'transparent.png'
image_path.write_bytes(PNG_BYTES)
image_content = str(image_path)
if message_format == 'pydantic_model':
messages = [Message(role='user', content='Why is the sky blue?', images=[Image(value=image_content)])]
elif message_format == 'dict':
messages = [{'role': 'user', 'content': 'Why is the sky blue?', 'images': [image_content]}]
else:
raise ValueError(f'Invalid message format: {message_format}')
response = client.chat('dummy', messages=messages)
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == "I don't know."
def test_client_chat_format_json(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'format': 'json',
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': '{"answer": "Because of Rayleigh scattering"}',
},
}
)
client = Client(httpserver.url_for('/'))
response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], format='json')
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == '{"answer": "Because of Rayleigh scattering"}'
def test_client_chat_format_pydantic(httpserver: HTTPServer):
class ResponseFormat(BaseModel):
answer: str
confidence: float
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'format': {'title': 'ResponseFormat', 'type': 'object', 'properties': {'answer': {'title': 'Answer', 'type': 'string'}, 'confidence': {'title': 'Confidence', 'type': 'number'}}, 'required': ['answer', 'confidence']},
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}',
},
}
)
client = Client(httpserver.url_for('/'))
response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], format=ResponseFormat.model_json_schema())
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}'
async def test_async_client_chat_format_json(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'format': 'json',
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': '{"answer": "Because of Rayleigh scattering"}',
},
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], format='json')
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == '{"answer": "Because of Rayleigh scattering"}'
async def test_async_client_chat_format_pydantic(httpserver: HTTPServer):
class ResponseFormat(BaseModel):
answer: str
confidence: float
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'format': {'title': 'ResponseFormat', 'type': 'object', 'properties': {'answer': {'title': 'Answer', 'type': 'string'}, 'confidence': {'title': 'Confidence', 'type': 'number'}}, 'required': ['answer', 'confidence']},
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}',
},
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], format=ResponseFormat.model_json_schema())
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}'
def test_client_generate(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'response': 'Because it is.',
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'Why is the sky blue?')
assert response['model'] == 'dummy'
assert response['response'] == 'Because it is.'
def test_client_generate_with_logprobs(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why',
'stream': False,
'logprobs': True,
'top_logprobs': 2,
},
).respond_with_json(
{
'model': 'dummy',
'response': 'Hello',
'logprobs': [
{
'token': 'Hello',
'logprob': -0.2,
'top_logprobs': [
{'token': 'Hello', 'logprob': -0.2},
{'token': 'Hi', 'logprob': -1.5},
],
}
],
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'Why', logprobs=True, top_logprobs=2)
assert response['logprobs'][0]['token'] == 'Hello'
assert response['logprobs'][0]['top_logprobs'][1]['token'] == 'Hi'
def test_client_generate_with_image_type(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'What is in this image?',
'stream': False,
'images': [PNG_BASE64],
},
).respond_with_json(
{
'model': 'dummy',
'response': 'A blue sky.',
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'What is in this image?', images=[Image(value=PNG_BASE64)])
assert response['model'] == 'dummy'
assert response['response'] == 'A blue sky.'
def test_client_generate_with_invalid_image(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'What is in this image?',
'stream': False,
'images': ['invalid_base64'],
},
).respond_with_json({'error': 'Invalid image data'}, status=400)
client = Client(httpserver.url_for('/'))
with pytest.raises(ValueError):
client.generate('dummy', 'What is in this image?', images=[Image(value='invalid_base64')])
def test_client_generate_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
for message in ['Because ', 'it ', 'is.']:
yield (
json.dumps(
{
'model': 'dummy',
'response': message,
}
)
+ '\n'
)
return Response(generate())
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'stream': True,
},
).respond_with_handler(stream_handler)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'Why is the sky blue?', stream=True)
it = iter(['Because ', 'it ', 'is.'])
for part in response:
assert part['model'] == 'dummy'
assert part['response'] == next(it)
def test_client_generate_images(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'stream': False,
'images': [PNG_BASE64],
},
).respond_with_json(
{
'model': 'dummy',
'response': 'Because it is.',
}
)
client = Client(httpserver.url_for('/'))
with tempfile.NamedTemporaryFile() as temp:
temp.write(PNG_BYTES)
temp.flush()
response = client.generate('dummy', 'Why is the sky blue?', images=[temp.name])
assert response['model'] == 'dummy'
assert response['response'] == 'Because it is.'
def test_client_generate_format_json(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'format': 'json',
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'response': '{"answer": "Because of Rayleigh scattering"}',
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'Why is the sky blue?', format='json')
assert response['model'] == 'dummy'
assert response['response'] == '{"answer": "Because of Rayleigh scattering"}'
def test_client_generate_format_pydantic(httpserver: HTTPServer):
class ResponseFormat(BaseModel):
answer: str
confidence: float
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'format': {'title': 'ResponseFormat', 'type': 'object', 'properties': {'answer': {'title': 'Answer', 'type': 'string'}, 'confidence': {'title': 'Confidence', 'type': 'number'}}, 'required': ['answer', 'confidence']},
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'response': '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}',
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'Why is the sky blue?', format=ResponseFormat.model_json_schema())
assert response['model'] == 'dummy'
assert response['response'] == '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}'
async def test_async_client_generate_format_json(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'format': 'json',
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'response': '{"answer": "Because of Rayleigh scattering"}',
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.generate('dummy', 'Why is the sky blue?', format='json')
assert response['model'] == 'dummy'
assert response['response'] == '{"answer": "Because of Rayleigh scattering"}'
async def test_async_client_generate_format_pydantic(httpserver: HTTPServer):
class ResponseFormat(BaseModel):
answer: str
confidence: float
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'format': {'title': 'ResponseFormat', 'type': 'object', 'properties': {'answer': {'title': 'Answer', 'type': 'string'}, 'confidence': {'title': 'Confidence', 'type': 'number'}}, 'required': ['answer', 'confidence']},
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'response': '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}',
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.generate('dummy', 'Why is the sky blue?', format=ResponseFormat.model_json_schema())
assert response['model'] == 'dummy'
assert response['response'] == '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}'
def test_client_generate_image(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy-image',
'prompt': 'a sunset over mountains',
'stream': False,
'width': 1024,
'height': 768,
'steps': 20,
},
).respond_with_json(
{
'model': 'dummy-image',
'image': PNG_BASE64,
'done': True,
'done_reason': 'stop',
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy-image', 'a sunset over mountains', width=1024, height=768, steps=20)
assert response['model'] == 'dummy-image'
assert response['image'] == PNG_BASE64
assert response['done'] is True
def test_client_generate_image_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
# Progress updates
for i in range(1, 4):
yield (
json.dumps(
{
'model': 'dummy-image',
'completed': i,
'total': 3,
'done': False,
}
)
+ '\n'
)
# Final response with image
yield (
json.dumps(
{
'model': 'dummy-image',
'image': PNG_BASE64,
'done': True,
'done_reason': 'stop',
}
)
+ '\n'
)
return Response(generate())
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy-image',
'prompt': 'a sunset over mountains',
'stream': True,
'width': 512,
'height': 512,
},
).respond_with_handler(stream_handler)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy-image', 'a sunset over mountains', stream=True, width=512, height=512)
parts = list(response)
# Check progress updates
assert parts[0]['completed'] == 1
assert parts[0]['total'] == 3
assert parts[0]['done'] is False
# Check final response
assert parts[-1]['image'] == PNG_BASE64
assert parts[-1]['done'] is True
async def test_async_client_generate_image(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy-image',
'prompt': 'a robot painting',
'stream': False,
'width': 1024,
'height': 1024,
},
).respond_with_json(
{
'model': 'dummy-image',
'image': PNG_BASE64,
'done': True,
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.generate('dummy-image', 'a robot painting', width=1024, height=1024)
assert response['model'] == 'dummy-image'
assert response['image'] == PNG_BASE64
def test_client_pull(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/pull',
method='POST',
json={
'model': 'dummy',
'insecure': False,
'stream': False,
},
).respond_with_json({'status': 'success'})
client = Client(httpserver.url_for('/'))
response = client.pull('dummy')
assert response['status'] == 'success'
def test_client_pull_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
yield json.dumps({'status': 'pulling manifest'}) + '\n'
yield json.dumps({'status': 'verifying sha256 digest'}) + '\n'
yield json.dumps({'status': 'writing manifest'}) + '\n'
yield json.dumps({'status': 'removing any unused layers'}) + '\n'
yield json.dumps({'status': 'success'}) + '\n'
return Response(generate())
httpserver.expect_ordered_request(
'/api/pull',
method='POST',
json={
'model': 'dummy',
'insecure': False,
'stream': True,
},
).respond_with_handler(stream_handler)
client = Client(httpserver.url_for('/'))
response = client.pull('dummy', stream=True)
it = iter(['pulling manifest', 'verifying sha256 digest', 'writing manifest', 'removing any unused layers', 'success'])
for part in response:
assert part['status'] == next(it)
def test_client_push(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/push',
method='POST',
json={
'model': 'dummy',
'insecure': False,
'stream': False,
},
).respond_with_json({'status': 'success'})
client = Client(httpserver.url_for('/'))
response = client.push('dummy')
assert response['status'] == 'success'
def test_client_push_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
yield json.dumps({'status': 'retrieving manifest'}) + '\n'
yield json.dumps({'status': 'pushing manifest'}) + '\n'
yield json.dumps({'status': 'success'}) + '\n'
return Response(generate())
httpserver.expect_ordered_request(
'/api/push',
method='POST',
json={
'model': 'dummy',
'insecure': False,
'stream': True,
},
).respond_with_handler(stream_handler)
client = Client(httpserver.url_for('/'))
response = client.push('dummy', stream=True)
it = iter(['retrieving manifest', 'pushing manifest', 'success'])
for part in response:
assert part['status'] == next(it)
@pytest.fixture
def userhomedir():
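# Point HOME at a fresh temporary directory for the test; the previous value is restored on teardown.
with tempfile.TemporaryDirectory() as temp: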
home = os.getenv('HOME', '')
os.environ['HOME'] = temp
yield Path(temp)
os.environ['HOME'] = home
def test_client_create_with_blob(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/create',
method='POST',
json={
'model': 'dummy',
'files': {'test.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'},
'stream': False,
},
).respond_with_json({'status': 'success'})
client = Client(httpserver.url_for('/'))
response = client.create('dummy', files={'test.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'})
assert response['status'] == 'success'
def test_client_create_with_parameters_roundtrip(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/create',
method='POST',
json={
'model': 'dummy',
'quantize': 'q4_k_m',
'from': 'mymodel',
'adapters': {'someadapter.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'},
'template': '[INST] <<SYS>>{{.System}}<</SYS>>\n{{.Prompt}} [/INST]',
'license': 'this is my license',
'system': '\nUse\nmultiline\nstrings.\n',
'parameters': {'stop': ['[INST]', '[/INST]', '<<SYS>>', '<</SYS>>'], 'pi': 3.14159},
'messages': [{'role': 'user', 'content': 'Hello there!'}, {'role': 'assistant', 'content': 'Hello there yourself!'}],
'stream': False,
},
).respond_with_json({'status': 'success'})
client = Client(httpserver.url_for('/'))
response = client.create(
'dummy',
quantize='q4_k_m',
from_='mymodel',
adapters={'someadapter.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'},
template='[INST] <<SYS>>{{.System}}<</SYS>>\n{{.Prompt}} [/INST]',
license='this is my license',
system='\nUse\nmultiline\nstrings.\n',
parameters={'stop': ['[INST]', '[/INST]', '<<SYS>>', '<</SYS>>'], 'pi': 3.14159},
messages=[{'role': 'user', 'content': 'Hello there!'}, {'role': 'assistant', 'content': 'Hello there yourself!'}],
stream=False,
)
assert response['status'] == 'success'
def test_client_create_from_library(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/create',
method='POST',
json={
'model': 'dummy',
'from': 'llama2',
'stream': False,
},
).respond_with_json({'status': 'success'})
client = Client(httpserver.url_for('/'))
response = client.create('dummy', from_='llama2')
assert response['status'] == 'success'
def test_client_create_blob(httpserver: HTTPServer):
httpserver.expect_ordered_request(re.compile('^/api/blobs/sha256[:-][0-9a-fA-F]{64}$'), method='POST').respond_with_response(Response(status=201))
client = Client(httpserver.url_for('/'))
# The temporary file is empty, so the expected digest below is the SHA-256 of empty input.
with tempfile.NamedTemporaryFile() as blob:
response = client.create_blob(blob.name)
assert response == 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
def test_client_create_blob_exists(httpserver: HTTPServer):
httpserver.expect_ordered_request(PrefixPattern('/api/blobs/'), method='POST').respond_with_response(Response(status=200))
client = Client(httpserver.url_for('/'))
with tempfile.NamedTemporaryFile() as blob:
response = client.create_blob(blob.name)
assert response == 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
def test_client_delete(httpserver: HTTPServer):
httpserver.expect_ordered_request(PrefixPattern('/api/delete'), method='DELETE').respond_with_response(Response(status=200))
client = Client(httpserver.url_for('/api/delete'))
response = client.delete('dummy')
assert response['status'] == 'success'
def test_client_copy(httpserver: HTTPServer):
httpserver.expect_ordered_request(PrefixPattern('/api/copy'), method='POST').respond_with_response(Response(status=200))
client = Client(httpserver.url_for('/api/copy'))
response = client.copy('dum', 'dummer')
assert response['status'] == 'success'
async def test_async_client_chat(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': "I don't know.",
},
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == "I don't know."
async def test_async_client_chat_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
for message in ['I ', "don't ", 'know.']:
yield (
json.dumps(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': message,
},
}
)
+ '\n'
)
return Response(generate())
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
'tools': [],
'stream': True,
},
).respond_with_handler(stream_handler)
client = AsyncClient(httpserver.url_for('/'))
response = await client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], stream=True)
it = iter(['I ', "don't ", 'know.'])
async for part in response:
assert part['message']['role'] == 'assistant'
assert part['message']['content'] == next(it)
async def test_async_client_chat_images(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [
{
'role': 'user',
'content': 'Why is the sky blue?',
'images': [PNG_BASE64],
},
],
'tools': [],
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': "I don't know.",
},
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.chat('dummy', messages=[{'role': 'user', 'content': 'Why is the sky blue?', 'images': [PNG_BYTES]}])
assert response['model'] == 'dummy'
assert response['message']['role'] == 'assistant'
assert response['message']['content'] == "I don't know."
async def test_async_client_generate(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'stream': False,
},
).respond_with_json(
{
'model': 'dummy',
'response': 'Because it is.',
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.generate('dummy', 'Why is the sky blue?')
assert response['model'] == 'dummy'
assert response['response'] == 'Because it is.'
async def test_async_client_generate_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
for message in ['Because ', 'it ', 'is.']:
yield (
json.dumps(
{
'model': 'dummy',
'response': message,
}
)
+ '\n'
)
return Response(generate())
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'stream': True,
},
).respond_with_handler(stream_handler)
client = AsyncClient(httpserver.url_for('/'))
response = await client.generate('dummy', 'Why is the sky blue?', stream=True)
it = iter(['Because ', 'it ', 'is.'])
async for part in response:
assert part['model'] == 'dummy'
assert part['response'] == next(it)
async def test_async_client_generate_images(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why is the sky blue?',
'stream': False,
'images': [PNG_BASE64],
},
).respond_with_json(
{
'model': 'dummy',
'response': 'Because it is.',
}
)
client = AsyncClient(httpserver.url_for('/'))
with tempfile.NamedTemporaryFile() as temp:
temp.write(PNG_BYTES)
temp.flush()
response = await client.generate('dummy', 'Why is the sky blue?', images=[temp.name])
assert response['model'] == 'dummy'
assert response['response'] == 'Because it is.'
async def test_async_client_pull(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/pull',
    method='POST',
    json={
      'model': 'dummy',
      'insecure': False,
      'stream': False,
    },
  ).respond_with_json({'status': 'success'})

  client = AsyncClient(httpserver.url_for('/'))
  response = await client.pull('dummy')
  assert response['status'] == 'success'


async def test_async_client_pull_stream(httpserver: HTTPServer):
  def stream_handler(_: Request):
    def generate():
      yield json.dumps({'status': 'pulling manifest'}) + '\n'
      yield json.dumps({'status': 'verifying sha256 digest'}) + '\n'
      yield json.dumps({'status': 'writing manifest'}) + '\n'
      yield json.dumps({'status': 'removing any unused layers'}) + '\n'
      yield json.dumps({'status': 'success'}) + '\n'

    return Response(generate())

  httpserver.expect_ordered_request(
    '/api/pull',
    method='POST',
    json={
      'model': 'dummy',
      'insecure': False,
      'stream': True,
    },
  ).respond_with_handler(stream_handler)

  client = AsyncClient(httpserver.url_for('/'))
  response = await client.pull('dummy', stream=True)

  it = iter(['pulling manifest', 'verifying sha256 digest', 'writing manifest', 'removing any unused layers', 'success'])
  async for part in response:
    assert part['status'] == next(it)
async def test_async_client_push(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/push',
    method='POST',
    json={
      'model': 'dummy',
      'insecure': False,
      'stream': False,
    },
  ).respond_with_json({'status': 'success'})

  client = AsyncClient(httpserver.url_for('/'))
  response = await client.push('dummy')
  assert response['status'] == 'success'


async def test_async_client_push_stream(httpserver: HTTPServer):
  def stream_handler(_: Request):
    def generate():
      yield json.dumps({'status': 'retrieving manifest'}) + '\n'
      yield json.dumps({'status': 'pushing manifest'}) + '\n'
      yield json.dumps({'status': 'success'}) + '\n'

    return Response(generate())

  httpserver.expect_ordered_request(
    '/api/push',
    method='POST',
    json={
      'model': 'dummy',
      'insecure': False,
      'stream': True,
    },
  ).respond_with_handler(stream_handler)

  client = AsyncClient(httpserver.url_for('/'))
  response = await client.push('dummy', stream=True)

  it = iter(['retrieving manifest', 'pushing manifest', 'success'])
  async for part in response:
    assert part['status'] == next(it)
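

# Illustrative sketch, not part of the upstream test suite. The stream handlers
# in the tests above emit one JSON object per line (newline-delimited JSON).
# Assuming a consumer that receives arbitrary text chunks, the hypothetical
# helper below shows one way to reassemble complete lines and decode each one.
# It is an assumption for illustration, not the client's actual implementation.
import json
from typing import Any, Dict, Iterable, Iterator


def _sketch_iter_ndjson(chunks: Iterable[str]) -> Iterator[Dict[str, Any]]:
  buffer = ''
  for chunk in chunks:
    buffer += chunk
    # Emit every complete line; keep any trailing partial line buffered.
    while '\n' in buffer:
      line, buffer = buffer.split('\n', 1)
      if line.strip():
        yield json.loads(line)
  # A final object without a trailing newline is still decoded.
  if buffer.strip():
    yield json.loads(buffer)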
async def test_async_client_create_with_blob(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/create',
    method='POST',
    json={
      'model': 'dummy',
      'files': {'test.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'},
      'stream': False,
    },
  ).respond_with_json({'status': 'success'})

  client = AsyncClient(httpserver.url_for('/'))

  with tempfile.NamedTemporaryFile():
    response = await client.create('dummy', files={'test.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'})
    assert response['status'] == 'success'


async def test_async_client_create_with_parameters_roundtrip(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/create',
    method='POST',
    json={
      'model': 'dummy',
      'quantize': 'q4_k_m',
      'from': 'mymodel',
      'adapters': {'someadapter.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'},
      'template': '[INST] <<SYS>>{{.System}}<</SYS>>\n{{.Prompt}} [/INST]',
      'license': 'this is my license',
      'system': '\nUse\nmultiline\nstrings.\n',
      'parameters': {'stop': ['[INST]', '[/INST]', '<<SYS>>', '<</SYS>>'], 'pi': 3.14159},
      'messages': [{'role': 'user', 'content': 'Hello there!'}, {'role': 'assistant', 'content': 'Hello there yourself!'}],
      'stream': False,
    },
  ).respond_with_json({'status': 'success'})

  client = AsyncClient(httpserver.url_for('/'))

  with tempfile.NamedTemporaryFile():
    response = await client.create(
      'dummy',
      quantize='q4_k_m',
      from_='mymodel',
      adapters={'someadapter.gguf': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'},
      template='[INST] <<SYS>>{{.System}}<</SYS>>\n{{.Prompt}} [/INST]',
      license='this is my license',
      system='\nUse\nmultiline\nstrings.\n',
      parameters={'stop': ['[INST]', '[/INST]', '<<SYS>>', '<</SYS>>'], 'pi': 3.14159},
      messages=[{'role': 'user', 'content': 'Hello there!'}, {'role': 'assistant', 'content': 'Hello there yourself!'}],
      stream=False,
    )
    assert response['status'] == 'success'


async def test_async_client_create_from_library(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/create',
    method='POST',
    json={
      'model': 'dummy',
      'from': 'llama2',
      'stream': False,
    },
  ).respond_with_json({'status': 'success'})

  client = AsyncClient(httpserver.url_for('/'))

  response = await client.create('dummy', from_='llama2')
  assert response['status'] == 'success'
async def test_async_client_create_blob(httpserver: HTTPServer):
  httpserver.expect_ordered_request(re.compile('^/api/blobs/sha256[:-][0-9a-fA-F]{64}$'), method='POST').respond_with_response(Response(status=201))

  client = AsyncClient(httpserver.url_for('/'))

  with tempfile.NamedTemporaryFile() as blob:
    response = await client.create_blob(blob.name)
    assert response == 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'


async def test_async_client_create_blob_exists(httpserver: HTTPServer):
  httpserver.expect_ordered_request(PrefixPattern('/api/blobs/'), method='POST').respond_with_response(Response(status=200))

  client = AsyncClient(httpserver.url_for('/'))

  with tempfile.NamedTemporaryFile() as blob:
    response = await client.create_blob(blob.name)
    assert response == 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
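

# Illustrative sketch, not part of the upstream test suite. The blob tests above
# never write to their temporary file, and the digest they assert is the SHA-256
# of empty input. The hypothetical helpers below show one standard-library way a
# 'sha256:<hex>' digest could be derived from a file; this is an assumption for
# illustration, not necessarily how create_blob computes it.
import hashlib
import tempfile


def _sketch_sha256_digest(path: str) -> str:
  # Hash the file in chunks so large blobs need not fit in memory.
  sha256sum = hashlib.sha256()
  with open(path, 'rb') as f:
    for chunk in iter(lambda: f.read(32768), b''):
      sha256sum.update(chunk)
  return f'sha256:{sha256sum.hexdigest()}'


def _sketch_empty_file_digest() -> str:
  # An untouched temporary file is empty, so this returns the digest of b''.
  with tempfile.NamedTemporaryFile() as blob:
    return _sketch_sha256_digest(blob.name)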
async def test_async_client_delete(httpserver: HTTPServer):
  httpserver.expect_ordered_request(PrefixPattern('/api/delete'), method='DELETE').respond_with_response(Response(status=200))
  client = AsyncClient(httpserver.url_for('/api/delete'))
  response = await client.delete('dummy')
  assert response['status'] == 'success'


async def test_async_client_copy(httpserver: HTTPServer):
  httpserver.expect_ordered_request(PrefixPattern('/api/copy'), method='POST').respond_with_response(Response(status=200))
  client = AsyncClient(httpserver.url_for('/api/copy'))
  response = await client.copy('dum', 'dummer')
  assert response['status'] == 'success'
def test_headers():
  client = Client()
  assert client._client.headers['content-type'] == 'application/json'
  assert client._client.headers['accept'] == 'application/json'
  assert client._client.headers['user-agent'].startswith('ollama-python/')

  client = Client(
    headers={
      'X-Custom': 'value',
      'Content-Type': 'text/plain',
    }
  )
  assert client._client.headers['x-custom'] == 'value'
  assert client._client.headers['content-type'] == 'application/json'
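

# Illustrative sketch, not part of the upstream test suite. test_headers above
# shows that a caller-supplied Content-Type is replaced by 'application/json'
# while unrelated custom headers survive, and that lookups ignore case. The
# hypothetical helper below reproduces that precedence with plain dicts. It
# models only the behaviour the test observes and is not the client's code.
from typing import Dict, Mapping, Optional


def _sketch_merge_headers(custom: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
  # Lower-case the keys so 'Content-Type' and 'content-type' collide.
  merged = {key.lower(): value for key, value in (custom or {}).items()}
  # Fixed defaults are applied last, so they win over caller-supplied values.
  merged.update({'content-type': 'application/json', 'accept': 'application/json'})
  return merged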
def test_copy_tools():
  def func1(x: int) -> str:
    """Simple function 1.
    Args:
      x (integer): A number
    """

  def func2(y: str) -> int:
    """Simple function 2.
    Args:
      y (string): A string
    """

  # Test with list of functions
  tools = list(_copy_tools([func1, func2]))
  assert len(tools) == 2
  assert tools[0].function.name == 'func1'
  assert tools[1].function.name == 'func2'

  # Test with empty input
  assert list(_copy_tools()) == []
  assert list(_copy_tools(None)) == []
  assert list(_copy_tools([])) == []

  # Test with mix of functions and tool dicts
  tool_dict = {
    'type': 'function',
    'function': {
      'name': 'test',
      'description': 'Test function',
      'parameters': {
        'type': 'object',
        'properties': {'x': {'type': 'string', 'description': 'A string', 'enum': ['a', 'b', 'c']}, 'y': {'type': ['integer', 'number'], 'description': 'An integer'}},
        'required': ['x'],
      },
    },
  }

  tools = list(_copy_tools([func1, tool_dict]))
  assert len(tools) == 2
  assert tools[0].function.name == 'func1'
  assert tools[1].function.name == 'test'


def test_tool_validation():
  arbitrary_tool = {'type': 'custom_type', 'function': {'name': 'test'}}
  tools = list(_copy_tools([arbitrary_tool]))
  assert len(tools) == 1
  assert tools[0].type == 'custom_type'
  assert tools[0].function.name == 'test'
def test_client_connection_error():
  client = Client('http://localhost:1234')

  with pytest.raises(ConnectionError, match=CONNECTION_ERROR_MESSAGE):
    client.chat('model', messages=[{'role': 'user', 'content': 'prompt'}])
  with pytest.raises(ConnectionError, match=CONNECTION_ERROR_MESSAGE):
    client.generate('model', 'prompt')
  with pytest.raises(ConnectionError, match=CONNECTION_ERROR_MESSAGE):
    client.show('model')


async def test_async_client_connection_error():
  client = AsyncClient('http://localhost:1234')
  with pytest.raises(ConnectionError) as exc_info:
    await client.chat('model', messages=[{'role': 'user', 'content': 'prompt'}])
  assert str(exc_info.value) == 'Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible. https://ollama.com/download'
  with pytest.raises(ConnectionError) as exc_info:
    await client.generate('model', 'prompt')
  assert str(exc_info.value) == 'Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible. https://ollama.com/download'
  with pytest.raises(ConnectionError) as exc_info:
    await client.show('model')
  assert str(exc_info.value) == 'Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible. https://ollama.com/download'
def test_arbitrary_roles_accepted_in_message():
  _ = Message(role='somerandomrole', content="I'm ok with you adding any role message now!")


def _mock_request(*args: Any, **kwargs: Any) -> Response:
  # The body is valid JSON (double-quoted keys and values).
  return httpxResponse(status_code=200, content='{"response": "Hello world!"}')


def test_arbitrary_roles_accepted_in_message_request(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.setattr(Client, '_request', _mock_request)

  client = Client()

  client.chat(model='llama3.1', messages=[{'role': 'somerandomrole', 'content': "I'm ok with you adding any role message now!"}, {'role': 'user', 'content': 'Hello world!'}])


async def _mock_request_async(*args: Any, **kwargs: Any) -> Response:
  # The body is valid JSON (double-quoted keys and values).
  return httpxResponse(status_code=200, content='{"response": "Hello world!"}')


async def test_arbitrary_roles_accepted_in_message_request_async(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.setattr(AsyncClient, '_request', _mock_request_async)

  client = AsyncClient()

  await client.chat(model='llama3.1', messages=[{'role': 'somerandomrole', 'content': "I'm ok with you adding any role message now!"}, {'role': 'user', 'content': 'Hello world!'}])
def test_client_web_search_requires_bearer_auth_header(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.delenv('OLLAMA_API_KEY', raising=False)

  client = Client()

  with pytest.raises(ValueError, match='Authorization header with Bearer token is required for web search'):
    client.web_search('test query')


def test_client_web_fetch_requires_bearer_auth_header(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.delenv('OLLAMA_API_KEY', raising=False)

  client = Client()

  with pytest.raises(ValueError, match='Authorization header with Bearer token is required for web fetch'):
    client.web_fetch('https://example.com')


def _mock_request_web_search(self, cls, method, url, json=None, **kwargs):
  assert method == 'POST'
  assert url == 'https://ollama.com/api/web_search'
  assert json is not None and 'query' in json and 'max_results' in json
  return httpxResponse(status_code=200, content='{"results": {}, "success": true}')


def _mock_request_web_fetch(self, cls, method, url, json=None, **kwargs):
  assert method == 'POST'
  assert url == 'https://ollama.com/api/web_fetch'
  assert json is not None and 'url' in json
  return httpxResponse(status_code=200, content='{"results": {}, "success": true}')


def test_client_web_search_with_env_api_key(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.setenv('OLLAMA_API_KEY', 'test-key')
  monkeypatch.setattr(Client, '_request', _mock_request_web_search)

  client = Client()
  client.web_search('what is ollama?', max_results=2)


def test_client_web_fetch_with_env_api_key(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.setenv('OLLAMA_API_KEY', 'test-key')
  monkeypatch.setattr(Client, '_request', _mock_request_web_fetch)

  client = Client()
  client.web_fetch('https://example.com')


def test_client_web_search_with_explicit_bearer_header(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.delenv('OLLAMA_API_KEY', raising=False)
  monkeypatch.setattr(Client, '_request', _mock_request_web_search)

  client = Client(headers={'Authorization': 'Bearer custom-token'})
  client.web_search('what is ollama?', max_results=1)


def test_client_web_fetch_with_explicit_bearer_header(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.delenv('OLLAMA_API_KEY', raising=False)
  monkeypatch.setattr(Client, '_request', _mock_request_web_fetch)

  client = Client(headers={'Authorization': 'Bearer custom-token'})
  client.web_fetch('https://example.com')


def test_client_bearer_header_from_env(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.setenv('OLLAMA_API_KEY', 'env-token')

  client = Client()
  assert client._client.headers['authorization'] == 'Bearer env-token'


def test_client_explicit_bearer_header_overrides_env(monkeypatch: pytest.MonkeyPatch):
  monkeypatch.setenv('OLLAMA_API_KEY', 'env-token')
  monkeypatch.setattr(Client, '_request', _mock_request_web_search)

  client = Client(headers={'Authorization': 'Bearer explicit-token'})
  assert client._client.headers['authorization'] == 'Bearer explicit-token'
  client.web_search('override check')
def test_client_close():
  client = Client()
  client.close()
  assert client._client.is_closed


@pytest.mark.anyio
async def test_async_client_close():
  client = AsyncClient()
  await client.close()
  assert client._client.is_closed


def test_client_context_manager():
  with Client() as client:
    assert isinstance(client, Client)
    assert not client._client.is_closed

  # Leaving the with block must close the underlying client.
  assert client._client.is_closed


@pytest.mark.anyio
async def test_async_client_context_manager():
  async with AsyncClient() as client:
    assert isinstance(client, AsyncClient)
    assert not client._client.is_closed

  # Leaving the async with block must close the underlying client.
  assert client._client.is_closed
================================================
FILE: tests/test_type_serialization.py
================================================
import tempfile
from base64 import b64encode
from pathlib import Path

import pytest

from ollama._types import CreateRequest, Image


def test_image_serialization_bytes():
  image_bytes = b'test image bytes'
  encoded_string = b64encode(image_bytes).decode()
  img = Image(value=image_bytes)
  assert img.model_dump() == encoded_string


def test_image_serialization_base64_string():
  b64_str = 'dGVzdCBiYXNlNjQgc3RyaW5n'
  img = Image(value=b64_str)
  assert img.model_dump() == b64_str  # Should return as-is if valid base64


def test_image_serialization_long_base64_string():
  b64_str = 'dGVzdCBiYXNlNjQgc3RyaW5n' * 1000
  img = Image(value=b64_str)
  assert img.model_dump() == b64_str  # Should return as-is if valid base64
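

# Illustrative sketch, not part of the upstream test suite. The two tests above
# rely on 'dGVzdCBiYXNlNjQgc3RyaW5n' being valid base64 so that it is passed
# through unchanged. The hypothetical helper below shows one standard-library way
# to check validity; how Image actually decides is an assumption not confirmed
# by this file.
import base64
import binascii


def _sketch_is_valid_base64(value: str) -> bool:
  try:
    # validate=True rejects characters outside the base64 alphabet.
    base64.b64decode(value, validate=True)
  except (binascii.Error, ValueError):
    return False
  return True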
def test_image_serialization_plain_string():
  img = Image(value='not a path or base64')
  assert img.model_dump() == 'not a path or base64'  # Should return as-is


def test_image_serialization_path():
  with tempfile.NamedTemporaryFile() as temp_file:
    temp_file.write(b'test file content')
    temp_file.flush()
    img = Image(value=Path(temp_file.name))
    assert img.model_dump() == b64encode(b'test file content').decode()


def test_image_serialization_string_path():
  with tempfile.NamedTemporaryFile() as temp_file:
    temp_file.write(b'test file content')
    temp_file.flush()
    img = Image(value=temp_file.name)
    assert img.model_dump() == b64encode(b'test file content').decode()

  with pytest.raises(ValueError):
    img = Image(value='some_path/that/does/not/exist.png')
    img.model_dump()

  with pytest.raises(ValueError):
    img = Image(value='not an image')
    img.model_dump()
def test_create_request_serialization():
  request = CreateRequest(model='test-model', from_='base-model', quantize='q4_0', files={'file1': 'content1'}, adapters={'adapter1': 'content1'}, template='test template', license='MIT', system='test system', parameters={'param1': 'value1'})

  serialized = request.model_dump()
  assert serialized['from'] == 'base-model'
  assert 'from_' not in serialized
  assert serialized['quantize'] == 'q4_0'
  assert serialized['files'] == {'file1': 'content1'}
  assert serialized['adapters'] == {'adapter1': 'content1'}
  assert serialized['template'] == 'test template'
  assert serialized['license'] == 'MIT'
  assert serialized['system'] == 'test system'
  assert serialized['parameters'] == {'param1': 'value1'}


def test_create_request_serialization_exclude_none_true():
  request = CreateRequest(model='test-model', from_=None, quantize=None)
  serialized = request.model_dump(exclude_none=True)
  assert serialized == {'model': 'test-model'}
  assert 'from' not in serialized
  assert 'from_' not in serialized
  assert 'quantize' not in serialized


def test_create_request_serialization_exclude_none_false():
  request = CreateRequest(model='test-model', from_=None, quantize=None)
  serialized = request.model_dump(exclude_none=False)
  assert 'from' in serialized
  assert 'quantize' in serialized
  assert 'adapters' in serialized
  assert 'from_' not in serialized


def test_create_request_serialization_license_list():
  request = CreateRequest(model='test-model', license=['MIT', 'Apache-2.0'])
  serialized = request.model_dump()
  assert serialized['license'] == ['MIT', 'Apache-2.0']
================================================
FILE: tests/test_utils.py
================================================
import json
import sys
from typing import Dict, List, Mapping, Sequence, Set, Tuple, Union

from ollama._utils import convert_function_to_tool


def test_function_to_tool_conversion():
  def add_numbers(x: int, y: Union[int, None] = None) -> int:
    """Add two numbers together.
    args:
      x (integer): The first number
      y (integer, optional): The second number
    Returns:
      integer: The sum of x and y
    """
    return x + y

  tool = convert_function_to_tool(add_numbers).model_dump()

  assert tool['type'] == 'function'
  assert tool['function']['name'] == 'add_numbers'
  assert tool['function']['description'] == 'Add two numbers together.'
  assert tool['function']['parameters']['type'] == 'object'
  assert tool['function']['parameters']['properties']['x']['type'] == 'integer'
  assert tool['function']['parameters']['properties']['x']['description'] == 'The first number'
  assert tool['function']['parameters']['required'] == ['x']
def test_function_with_no_args():
  def simple_func():
    """
    A simple function with no arguments.
    Args:
      None
    Returns:
      None
    """

  tool = convert_function_to_tool(simple_func).model_dump()
  assert tool['function']['name'] == 'simple_func'
  assert tool['function']['description'] == 'A simple function with no arguments.'
  assert tool['function']['parameters']['properties'] == {}
def test_function_with_all_types():
  if sys.version_info >= (3, 10):

    def all_types(
      x: int,
      y: str,
      z: list[int],
      w: dict[str, int],
      v: int | str | None,
    ) -> int | dict[str, int] | str | list[int] | None:
      """
      A function with all types.
      Args:
        x (integer): The first number
        y (string): The second number
        z (array): The third number
        w (object): The fourth number
        v (integer | string | None): The fifth number
      """

  else:

    def all_types(
      x: int,
      y: str,
      z: Sequence,
      w: Mapping[str, int],
      d: Dict[str, int],
      s: Set[int],
      t: Tuple[int, str],
      l: List[int],  # noqa: E741
      o: Union[int, None],
    ) -> Union[Mapping[str, int], str, None]:
      """
      A function with all types.
      Args:
        x (integer): The first number
        y (string): The second number
        z (array): The third number
        w (object): The fourth number
        d (object): The fifth number
        s (array): The sixth number
        t (array): The seventh number
        l (array): The eighth number
        o (integer | None): The ninth number
      """

  tool_json = convert_function_to_tool(all_types).model_dump_json()
  tool = json.loads(tool_json)
  assert tool['function']['parameters']['properties']['x']['type'] == 'integer'
  assert tool['function']['parameters']['properties']['y']['type'] == 'string'

  if sys.version_info >= (3, 10):
    assert tool['function']['parameters']['properties']['z']['type'] == 'array'
    assert tool['function']['parameters']['properties']['w']['type'] == 'object'
    assert {x.strip().strip("'") for x in tool['function']['parameters']['properties']['v']['type'].removeprefix('[').removesuffix(']').split(',')} == {'string', 'integer'}
    assert tool['function']['parameters']['properties']['v']['type'] != 'null'
    assert tool['function']['parameters']['required'] == ['x', 'y', 'z', 'w']
  else:
    assert tool['function']['parameters']['properties']['z']['type'] == 'array'
    assert tool['function']['parameters']['properties']['w']['type'] == 'object'
    assert tool['function']['parameters']['properties']['d']['type'] == 'object'
    assert tool['function']['parameters']['properties']['s']['type'] == 'array'
    assert tool['function']['parameters']['properties']['t']['type'] == 'array'
    assert tool['function']['parameters']['properties']['l']['type'] == 'array'
    assert tool['function']['parameters']['properties']['o']['type'] == 'integer'
    assert tool['function']['parameters']['properties']['o']['type'] != 'null'
    assert tool['function']['parameters']['required'] == ['x', 'y', 'z', 'w', 'd', 's', 't', 'l']
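

# Illustrative sketch, not part of the upstream test suite. The 'required' lists
# asserted in the tests above are consistent with one rule: a parameter is
# required unless its annotation admits None (note that 'v' and 'o' have no
# default yet are not required). The hypothetical helpers below apply that rule
# with typing.get_type_hints; convert_function_to_tool may well compute it
# differently, so treat this as an assumption.
import typing
from typing import Callable, List, Optional, Union


def _sketch_required_params(func: Callable) -> List[str]:
  hints = typing.get_type_hints(func)
  # The return annotation is not a parameter.
  hints.pop('return', None)
  # get_args() returns the members of a Union, or () for a plain type.
  return [name for name, hint in hints.items() if type(None) not in typing.get_args(hint)]


def _sketch_add(x: int, y: Optional[int] = None) -> int:
  return x + (y or 0)


def _sketch_no_default(x: int, v: Union[int, str, None]) -> None:
  return None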
def test_function_docstring_parsing():
  from typing import Any, Dict, List

  def func_with_complex_docs(x: int, y: List[str]) -> Dict[str, Any]:
    """
    Test function with complex docstring.
    Args:
      x (integer): A number
        with multiple lines
      y (array of string): A list
        with multiple lines
    Returns:
      object: A dictionary
        with multiple lines
    """

  tool = convert_function_to_tool(func_with_complex_docs).model_dump()
  assert tool['function']['description'] == 'Test function with complex docstring.'
asse
SYMBOL INDEX (293 symbols across 23 files)
FILE: examples/async-chat.py
function main (line 6) | async def main():
FILE: examples/async-generate.py
function main (line 6) | async def main():
FILE: examples/async-structured-outputs.py
class FriendInfo (line 9) | class FriendInfo(BaseModel):
class FriendList (line 15) | class FriendList(BaseModel):
function main (line 19) | async def main():
FILE: examples/async-tools.py
function add_two_numbers (line 7) | def add_two_numbers(a: int, b: int) -> int:
function subtract_two_numbers (line 21) | def subtract_two_numbers(a: int, b: int) -> int:
function main (line 54) | async def main():
FILE: examples/chat-logprobs.py
function print_logprobs (line 6) | def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
FILE: examples/generate-logprobs.py
function print_logprobs (line 6) | def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
FILE: examples/gpt-oss-tools-stream.py
function get_weather (line 18) | def get_weather(city: str) -> str:
function get_weather_conditions (line 35) | def get_weather_conditions(city: str) -> str:
FILE: examples/gpt-oss-tools.py
function get_weather (line 17) | def get_weather(city: str) -> str:
function get_weather_conditions (line 34) | def get_weather_conditions(city: str) -> str:
FILE: examples/multi-tool.py
function get_temperature (line 7) | def get_temperature(city: str) -> int:
function get_conditions (line 26) | def get_conditions(city: str) -> str:
FILE: examples/structured-outputs-image.py
class Object (line 10) | class Object(BaseModel):
class ImageDescription (line 16) | class ImageDescription(BaseModel):
FILE: examples/structured-outputs.py
class FriendInfo (line 7) | class FriendInfo(BaseModel):
class FriendList (line 13) | class FriendList(BaseModel):
FILE: examples/thinking-levels.py
function heading (line 4) | def heading(text):
FILE: examples/tools.py
function add_two_numbers (line 4) | def add_two_numbers(a: int, b: int) -> int:
function subtract_two_numbers (line 21) | def subtract_two_numbers(a: int, b: int) -> int:
FILE: examples/web-search-gpt-oss.py
function main (line 14) | def main() -> None:
FILE: examples/web-search-mcp.py
function _web_search_impl (line 40) | def _web_search_impl(query: str, max_results: int = 3) -> Dict[str, Any]:
function _web_fetch_impl (line 45) | def _web_fetch_impl(url: str) -> Dict[str, Any]:
function web_search (line 54) | def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
function web_fetch (line 69) | def web_fetch(url: str) -> Dict[str, Any]:
function web_search (line 89) | async def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
function web_fetch (line 101) | async def web_fetch(url: str) -> Dict[str, Any]:
function _main (line 111) | async def _main() -> None:
FILE: examples/web-search.py
function format_tool_results (line 15) | def format_tool_results(
FILE: examples/web_search_gpt_oss_helper.py
class Page (line 13) | class Page:
class BrowserStateData (line 23) | class BrowserStateData:
class WebSearchResult (line 30) | class WebSearchResult:
class SearchClient (line 36) | class SearchClient(Protocol):
method search (line 37) | def search(self, queries: List[str], max_results: Optional[int] = None...
class CrawlClient (line 40) | class CrawlClient(Protocol):
method crawl (line 41) | def crawl(self, urls: List[str]): ...
function cap_tool_content (line 52) | def cap_tool_content(text: str) -> str:
function _safe_domain (line 62) | def _safe_domain(u: str) -> str:
class BrowserState (line 74) | class BrowserState:
method __init__ (line 75) | def __init__(self, initial_state: Optional[BrowserStateData] = None):
method get_data (line 78) | def get_data(self) -> BrowserStateData:
method set_data (line 81) | def set_data(self, data: BrowserStateData) -> None:
class Browser (line 88) | class Browser:
method __init__ (line 89) | def __init__(
method set_client (line 97) | def set_client(self, client: Client) -> None:
method get_state (line 100) | def get_state(self) -> BrowserStateData:
method _save_page (line 105) | def _save_page(self, page: Page) -> None:
method _page_from_stack (line 111) | def _page_from_stack(self, url: str) -> Page:
method _join_lines_with_numbers (line 118) | def _join_lines_with_numbers(self, lines: List[str]) -> str:
method _wrap_lines (line 124) | def _wrap_lines(self, text: str, width: int = 80) -> List[str]:
method _process_markdown_links (line 151) | def _process_markdown_links(self, text: str) -> Tuple[str, Dict[int, s...
method _get_end_loc (line 174) | def _get_end_loc(self, loc: int, num_lines: int, total_lines: int, lin...
method _display_page (line 183) | def _display_page(self, page: Page, cursor: int, loc: int, num_lines: ...
method _build_search_results_page_collection (line 208) | def _build_search_results_page_collection(self, query: str, results: D...
method _build_search_result_page (line 242) | def _build_search_result_page(self, result: WebSearchResult, link_idx:...
method _build_page_from_fetch (line 271) | def _build_page_from_fetch(self, requested_url: str, fetch_response: D...
method _build_find_results_page (line 302) | def _build_find_results_page(self, pattern: str, page: Page) -> Page:
method search (line 343) | def search(self, *, query: str, topn: int = 5) -> Dict[str, Any]:
method open (line 381) | def open(
method find (line 497) | def find(self, *, pattern: str, cursor: int = -1) -> Dict[str, Any]:
FILE: ollama/_client.py
class BaseClient (line 79) | class BaseClient(contextlib.AbstractContextManager, contextlib.AbstractA...
method __init__ (line 80) | def __init__(
method __exit__ (line 120) | def __exit__(self, exc_type, exc_val, exc_tb):
method __aexit__ (line 123) | async def __aexit__(self, exc_type, exc_val, exc_tb):
class Client (line 130) | class Client(BaseClient):
method __init__ (line 131) | def __init__(self, host: Optional[str] = None, **kwargs) -> None:
method close (line 134) | def close(self):
method _request_raw (line 137) | def _request_raw(self, *args, **kwargs):
method _request (line 148) | def _request(
method _request (line 157) | def _request(
method _request (line 166) | def _request(
method _request (line 174) | def _request(
method generate (line 202) | def generate(
method generate (line 226) | def generate(
method generate (line 249) | def generate(
method chat (line 309) | def chat(
method chat (line 325) | def chat(
method chat (line 340) | def chat(
method embed (line 406) | def embed(
method embeddings (line 429) | def embeddings(
method pull (line 452) | def pull(
method pull (line 461) | def pull(
method pull (line 469) | def pull(
method push (line 494) | def push(
method push (line 503) | def push(
method push (line 511) | def push(
method create (line 536) | def create(
method create (line 553) | def create(
method create (line 569) | def create(
method create_blob (line 609) | def create_blob(self, path: Union[str, Path]) -> str:
method list (line 625) | def list(self) -> ListResponse:
method delete (line 632) | def delete(self, model: str) -> StatusResponse:
method copy (line 644) | def copy(self, source: str, destination: str) -> StatusResponse:
method show (line 657) | def show(self, model: str) -> ShowResponse:
method ps (line 667) | def ps(self) -> ProcessResponse:
method web_search (line 674) | def web_search(self, query: str, max_results: int = 3) -> WebSearchRes...
method web_fetch (line 700) | def web_fetch(self, url: str) -> WebFetchResponse:
class AsyncClient (line 723) | class AsyncClient(BaseClient):
method __init__ (line 724) | def __init__(self, host: Optional[str] = None, **kwargs) -> None:
method close (line 727) | async def close(self):
method _request_raw (line 730) | async def _request_raw(self, *args, **kwargs):
method _request (line 741) | async def _request(
method _request (line 750) | async def _request(
method _request (line 759) | async def _request(
method _request (line 767) | async def _request(
method web_search (line 794) | async def web_search(self, query: str, max_results: int = 3) -> WebSea...
method web_fetch (line 815) | async def web_fetch(self, url: str) -> WebFetchResponse:
method generate (line 835) | async def generate(
method generate (line 859) | async def generate(
method generate (line 882) | async def generate(
method chat (line 941) | async def chat(
method chat (line 957) | async def chat(
method chat (line 972) | async def chat(
method embed (line 1039) | async def embed(
method embeddings (line 1062) | async def embeddings(
method pull (line 1085) | async def pull(
method pull (line 1094) | async def pull(
method pull (line 1102) | async def pull(
method push (line 1127) | async def push(
method push (line 1136) | async def push(
method push (line 1144) | async def push(
method create (line 1169) | async def create(
method create (line 1186) | async def create(
method create (line 1202) | async def create(
method create_blob (line 1243) | async def create_blob(self, path: Union[str, Path]) -> str:
method list (line 1266) | async def list(self) -> ListResponse:
method delete (line 1273) | async def delete(self, model: str) -> StatusResponse:
method copy (line 1285) | async def copy(self, source: str, destination: str) -> StatusResponse:
method show (line 1298) | async def show(self, model: str) -> ShowResponse:
method ps (line 1308) | async def ps(self) -> ProcessResponse:
function _copy_images (line 1316) | def _copy_images(images: Optional[Sequence[Union[Image, Any]]]) -> Itera...
function _copy_messages (line 1321) | def _copy_messages(messages: Optional[Sequence[Union[Mapping[str, Any], ...
function _copy_tools (line 1328) | def _copy_tools(tools: Optional[Sequence[Union[Mapping[str, Any], Tool, ...
function _as_path (line 1333) | def _as_path(s: Optional[Union[str, PathLike]]) -> Union[Path, None]:
function _parse_host (line 1343) | def _parse_host(host: Optional[str]) -> str:
FILE: ollama/_types.py
class SubscriptableBaseModel (line 19) | class SubscriptableBaseModel(BaseModel):
method __getitem__ (line 20) | def __getitem__(self, key: str) -> Any:
method __setitem__ (line 35) | def __setitem__(self, key: str, value: Any) -> None:
method __contains__ (line 49) | def __contains__(self, key: str) -> bool:
method get (line 87) | def get(self, key: str, default: Any = None) -> Any:
class Options (line 104) | class Options(SubscriptableBaseModel):
class BaseRequest (line 140) | class BaseRequest(SubscriptableBaseModel):
class BaseStreamableRequest (line 145) | class BaseStreamableRequest(BaseRequest):
class BaseGenerateRequest (line 150) | class BaseGenerateRequest(BaseStreamableRequest):
class Image (line 161) | class Image(BaseModel):
method serialize_model (line 165) | def serialize_model(self):
class GenerateRequest (line 189) | class GenerateRequest(BaseGenerateRequest):
class BaseGenerateResponse (line 230) | class BaseGenerateResponse(SubscriptableBaseModel):
class TokenLogprob (line 262) | class TokenLogprob(SubscriptableBaseModel):
class Logprob (line 270) | class Logprob(TokenLogprob):
class GenerateResponse (line 275) | class GenerateResponse(BaseGenerateResponse):
class Message (line 304) | class Message(SubscriptableBaseModel):
class ToolCall (line 333) | class ToolCall(SubscriptableBaseModel):
class Function (line 338) | class Function(SubscriptableBaseModel):
class Tool (line 358) | class Tool(SubscriptableBaseModel):
class Function (line 361) | class Function(SubscriptableBaseModel):
class Parameters (line 365) | class Parameters(SubscriptableBaseModel):
class Property (line 372) | class Property(SubscriptableBaseModel):
class ChatRequest (line 387) | class ChatRequest(BaseGenerateRequest):
method serialize_model (line 389) | def serialize_model(self, nxt):
class ChatResponse (line 413) | class ChatResponse(BaseGenerateResponse):
class EmbedRequest (line 425) | class EmbedRequest(BaseRequest):
class EmbedResponse (line 441) | class EmbedResponse(BaseGenerateResponse):
class EmbeddingsRequest (line 450) | class EmbeddingsRequest(BaseRequest):
class EmbeddingsResponse (line 460) | class EmbeddingsResponse(SubscriptableBaseModel):
class PullRequest (line 469) | class PullRequest(BaseStreamableRequest):
class PushRequest (line 478) | class PushRequest(BaseStreamableRequest):
class CreateRequest (line 487) | class CreateRequest(BaseStreamableRequest):
method serialize_model (line 489) | def serialize_model(self, nxt):
class ModelDetails (line 509) | class ModelDetails(SubscriptableBaseModel):
class ListResponse (line 518) | class ListResponse(SubscriptableBaseModel):
class Model (line 519) | class Model(SubscriptableBaseModel):
class DeleteRequest (line 530) | class DeleteRequest(BaseRequest):
class CopyRequest (line 536) | class CopyRequest(BaseModel):
class StatusResponse (line 548) | class StatusResponse(SubscriptableBaseModel):
class ProgressResponse (line 552) | class ProgressResponse(StatusResponse):
class ShowRequest (line 558) | class ShowRequest(BaseRequest):
class ShowResponse (line 564) | class ShowResponse(SubscriptableBaseModel):
class ProcessResponse (line 582) | class ProcessResponse(SubscriptableBaseModel):
class Model (line 583) | class Model(SubscriptableBaseModel):
class WebSearchRequest (line 596) | class WebSearchRequest(SubscriptableBaseModel):
class WebSearchResult (line 601) | class WebSearchResult(SubscriptableBaseModel):
class WebFetchRequest (line 607) | class WebFetchRequest(SubscriptableBaseModel):
class WebSearchResponse (line 611) | class WebSearchResponse(SubscriptableBaseModel):
class WebFetchResponse (line 615) | class WebFetchResponse(SubscriptableBaseModel):
class RequestError (line 621) | class RequestError(Exception):
method __init__ (line 626) | def __init__(self, error: str):
class ResponseError (line 632) | class ResponseError(Exception):
method __init__ (line 637) | def __init__(self, error: str, status_code: int = -1):
method __str__ (line 650) | def __str__(self) -> str:
FILE: ollama/_utils.py
function _parse_docstring (line 13) | def _parse_docstring(doc_string: Union[str, None]) -> dict[str, str]:
function convert_function_to_tool (line 56) | def convert_function_to_tool(func: Callable) -> Tool:
FILE: tests/test_client.py
function anyio_backend (line 25) | def anyio_backend():
class PrefixPattern (line 29) | class PrefixPattern(URIPattern):
method __init__ (line 30) | def __init__(self, prefix: str):
method match (line 33) | def match(self, uri):
function test_client_chat (line 37) | def test_client_chat(httpserver: HTTPServer):
function test_client_chat_with_logprobs (line 64) | def test_client_chat_with_logprobs(httpserver: HTTPServer):
function test_client_chat_stream (line 102) | def test_client_chat_stream(httpserver: HTTPServer):
function test_client_chat_images (line 143) | def test_client_chat_images(httpserver: HTTPServer, message_format: str,...
function test_client_chat_format_json (line 193) | def test_client_chat_format_json(httpserver: HTTPServer):
function test_client_chat_format_pydantic (line 221) | def test_client_chat_format_pydantic(httpserver: HTTPServer):
function test_async_client_chat_format_json (line 253) | async def test_async_client_chat_format_json(httpserver: HTTPServer):
function test_async_client_chat_format_pydantic (line 281) | async def test_async_client_chat_format_pydantic(httpserver: HTTPServer):
function test_client_generate (line 313) | def test_client_generate(httpserver: HTTPServer):
function test_client_generate_with_logprobs (line 335) | def test_client_generate_with_logprobs(httpserver: HTTPServer):
function test_client_generate_with_image_type (line 369) | def test_client_generate_with_image_type(httpserver: HTTPServer):
function test_client_generate_with_invalid_image (line 392) | def test_client_generate_with_invalid_image(httpserver: HTTPServer):
function test_client_generate_stream (line 409) | def test_client_generate_stream(httpserver: HTTPServer):
function test_client_generate_images (line 444) | def test_client_generate_images(httpserver: HTTPServer):
function test_client_generate_format_json (line 471) | def test_client_generate_format_json(httpserver: HTTPServer):
function test_client_generate_format_pydantic (line 494) | def test_client_generate_format_pydantic(httpserver: HTTPServer):
function test_async_client_generate_format_json (line 521) | async def test_async_client_generate_format_json(httpserver: HTTPServer):
function test_async_client_generate_format_pydantic (line 544) | async def test_async_client_generate_format_pydantic(httpserver: HTTPSer...
function test_client_generate_image (line 571) | def test_client_generate_image(httpserver: HTTPServer):
function test_client_generate_image_stream (line 599) | def test_client_generate_image_stream(httpserver: HTTPServer):
function test_async_client_generate_image (line 655) | async def test_async_client_generate_image(httpserver: HTTPServer):
function test_client_pull (line 680) | def test_client_pull(httpserver: HTTPServer):
function test_client_pull_stream (line 696) | def test_client_pull_stream(httpserver: HTTPServer):
function test_client_push (line 725) | def test_client_push(httpserver: HTTPServer):
function test_client_push_stream (line 741) | def test_client_push_stream(httpserver: HTTPServer):
function userhomedir (line 769) | def userhomedir():
function test_client_create_with_blob (line 777) | def test_client_create_with_blob(httpserver: HTTPServer):
function test_client_create_with_parameters_roundtrip (line 795) | def test_client_create_with_parameters_roundtrip(httpserver: HTTPServer):
function test_client_create_from_library (line 831) | def test_client_create_from_library(httpserver: HTTPServer):
function test_client_create_blob (line 848) | def test_client_create_blob(httpserver: HTTPServer):
function test_client_create_blob_exists (line 858) | def test_client_create_blob_exists(httpserver: HTTPServer):
function test_client_delete (line 868) | def test_client_delete(httpserver: HTTPServer):
function test_client_copy (line 875) | def test_client_copy(httpserver: HTTPServer):
function test_async_client_chat (line 882) | async def test_async_client_chat(httpserver: HTTPServer):
function test_async_client_chat_stream (line 909) | async def test_async_client_chat_stream(httpserver: HTTPServer):
function test_async_client_chat_images (line 948) | async def test_async_client_chat_images(httpserver: HTTPServer):
function test_async_client_generate (line 982) | async def test_async_client_generate(httpserver: HTTPServer):
function test_async_client_generate_stream (line 1004) | async def test_async_client_generate_stream(httpserver: HTTPServer):
function test_async_client_generate_images (line 1039) | async def test_async_client_generate_images(httpserver: HTTPServer):
function test_async_client_pull (line 1066) | async def test_async_client_pull(httpserver: HTTPServer):
function test_async_client_pull_stream (line 1082) | async def test_async_client_pull_stream(httpserver: HTTPServer):
function test_async_client_push (line 1111) | async def test_async_client_push(httpserver: HTTPServer):
function test_async_client_push_stream (line 1127) | async def test_async_client_push_stream(httpserver: HTTPServer):
function test_async_client_create_with_blob (line 1154) | async def test_async_client_create_with_blob(httpserver: HTTPServer):
function test_async_client_create_with_parameters_roundtrip (line 1172) | async def test_async_client_create_with_parameters_roundtrip(httpserver:...
function test_async_client_create_from_library (line 1208) | async def test_async_client_create_from_library(httpserver: HTTPServer):
function test_async_client_create_blob (line 1225) | async def test_async_client_create_blob(httpserver: HTTPServer):
function test_async_client_create_blob_exists (line 1235) | async def test_async_client_create_blob_exists(httpserver: HTTPServer):
function test_async_client_delete (line 1245) | async def test_async_client_delete(httpserver: HTTPServer):
function test_async_client_copy (line 1252) | async def test_async_client_copy(httpserver: HTTPServer):
function test_headers (line 1259) | def test_headers():
function test_copy_tools (line 1275) | def test_copy_tools():
function test_tool_validation (line 1319) | def test_tool_validation():
function test_client_connection_error (line 1327) | def test_client_connection_error():
function test_async_client_connection_error (line 1340) | async def test_async_client_connection_error():
function test_arbitrary_roles_accepted_in_message (line 1353) | def test_arbitrary_roles_accepted_in_message():
function _mock_request (line 1357) | def _mock_request(*args: Any, **kwargs: Any) -> Response:
function test_arbitrary_roles_accepted_in_message_request (line 1361) | def test_arbitrary_roles_accepted_in_message_request(monkeypatch: pytest...
function _mock_request_async (line 1369) | async def _mock_request_async(*args: Any, **kwargs: Any) -> Response:
function test_arbitrary_roles_accepted_in_message_request_async (line 1373) | async def test_arbitrary_roles_accepted_in_message_request_async(monkeyp...
function test_client_web_search_requires_bearer_auth_header (line 1381) | def test_client_web_search_requires_bearer_auth_header(monkeypatch: pyte...
function test_client_web_fetch_requires_bearer_auth_header (line 1390) | def test_client_web_fetch_requires_bearer_auth_header(monkeypatch: pytes...
function _mock_request_web_search (line 1399) | def _mock_request_web_search(self, cls, method, url, json=None, **kwargs):
function _mock_request_web_fetch (line 1406) | def _mock_request_web_fetch(self, cls, method, url, json=None, **kwargs):
function test_client_web_search_with_env_api_key (line 1413) | def test_client_web_search_with_env_api_key(monkeypatch: pytest.MonkeyPa...
function test_client_web_fetch_with_env_api_key (line 1421) | def test_client_web_fetch_with_env_api_key(monkeypatch: pytest.MonkeyPat...
function test_client_web_search_with_explicit_bearer_header (line 1429) | def test_client_web_search_with_explicit_bearer_header(monkeypatch: pyte...
function test_client_web_fetch_with_explicit_bearer_header (line 1437) | def test_client_web_fetch_with_explicit_bearer_header(monkeypatch: pytes...
function test_client_bearer_header_from_env (line 1445) | def test_client_bearer_header_from_env(monkeypatch: pytest.MonkeyPatch):
function test_client_explicit_bearer_header_overrides_env (line 1452) | def test_client_explicit_bearer_header_overrides_env(monkeypatch: pytest...
function test_client_close (line 1461) | def test_client_close():
function test_async_client_close (line 1468) | async def test_async_client_close():
function test_client_context_manager (line 1474) | def test_client_context_manager():
function test_async_client_context_manager (line 1483) | async def test_async_client_context_manager():
FILE: tests/test_type_serialization.py
function test_image_serialization_bytes (line 10) | def test_image_serialization_bytes():
function test_image_serialization_base64_string (line 17) | def test_image_serialization_base64_string():
function test_image_serialization_long_base64_string (line 23) | def test_image_serialization_long_base64_string():
function test_image_serialization_plain_string (line 29) | def test_image_serialization_plain_string():
function test_image_serialization_path (line 34) | def test_image_serialization_path():
function test_image_serialization_string_path (line 42) | def test_image_serialization_string_path():
function test_create_request_serialization (line 58) | def test_create_request_serialization():
function test_create_request_serialization_exclude_none_true (line 73) | def test_create_request_serialization_exclude_none_true():
function test_create_request_serialization_exclude_none_false (line 82) | def test_create_request_serialization_exclude_none_false():
function test_create_request_serialization_license_list (line 91) | def test_create_request_serialization_license_list():
FILE: tests/test_utils.py
function test_function_to_tool_conversion (line 8) | def test_function_to_tool_conversion():
function test_function_with_no_args (line 31) | def test_function_with_no_args():
function test_function_with_all_types (line 47) | def test_function_with_all_types():
function test_function_docstring_parsing (line 116) | def test_function_docstring_parsing():
function test_skewed_docstring_parsing (line 140) | def test_skewed_docstring_parsing():
function test_function_with_no_docstring (line 160) | def test_function_with_no_docstring():
function test_function_with_only_description (line 174) | def test_function_with_only_description():
function test_function_with_yields (line 203) | def test_function_with_yields():
function test_function_with_no_types (line 222) | def test_function_with_no_types():
function test_function_with_parentheses (line 233) | def test_function_with_parentheses():
Condensed preview: 52 files, each showing path, character count, and a content snippet (208K chars for the full structured content).
[
{
"path": ".github/dependabot.yml",
"chars": 191,
"preview": "version: 2\nupdates:\n - package-ecosystem: github-actions\n directory: /\n schedule:\n interval: daily\n - packa"
},
{
"path": ".github/workflows/publish.yaml",
"chars": 533,
"preview": "name: publish\n\non:\n release:\n types:\n - created\n\njobs:\n publish:\n runs-on: ubuntu-latest\n environment: r"
},
{
"path": ".github/workflows/test.yaml",
"chars": 872,
"preview": "name: test\n\non:\n push:\n branches:\n - main\n pull_request:\n\njobs:\n test:\n runs-on: ubuntu-latest\n steps:\n"
},
{
"path": ".gitignore",
"chars": 3078,
"preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
},
{
"path": "LICENSE",
"chars": 1058,
"preview": "MIT License\n\nCopyright (c) Ollama\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this "
},
{
"path": "README.md",
"chars": 5369,
"preview": "# Ollama Python Library\n\nThe Ollama Python library provides the easiest way to integrate Python 3.8+ projects with [Olla"
},
{
"path": "SECURITY.md",
"chars": 1005,
"preview": "# Security\n\nThe Ollama maintainer team takes security seriously and will actively work to resolve security issues.\n\n## R"
},
{
"path": "examples/README.md",
"chars": 3310,
"preview": "# Running Examples\n\nRun the examples in this directory with:\n\n```sh\n# Run example\npython3 examples/<example>.py\n\n# or wi"
},
{
"path": "examples/async-chat.py",
"chars": 339,
"preview": "import asyncio\n\nfrom ollama import AsyncClient\n\n\nasync def main():\n messages = [\n {\n 'role': 'user',\n 'con"
},
{
"path": "examples/async-generate.py",
"chars": 293,
"preview": "import asyncio\n\nimport ollama\n\n\nasync def main():\n client = ollama.AsyncClient()\n response = await client.generate('ge"
},
{
"path": "examples/async-structured-outputs.py",
"chars": 929,
"preview": "import asyncio\n\nfrom pydantic import BaseModel\n\nfrom ollama import AsyncClient\n\n\n# Define the schema for the response\ncl"
},
{
"path": "examples/async-tools.py",
"chars": 2467,
"preview": "import asyncio\n\nimport ollama\nfrom ollama import ChatResponse\n\n\ndef add_two_numbers(a: int, b: int) -> int:\n \"\"\"\n Add "
},
{
"path": "examples/chat-logprobs.py",
"chars": 737,
"preview": "from typing import Iterable\n\nimport ollama\n\n\ndef print_logprobs(logprobs: Iterable[dict], label: str) -> None:\n print(f"
},
{
"path": "examples/chat-stream.py",
"chars": 225,
"preview": "from ollama import chat\n\nmessages = [\n {\n 'role': 'user',\n 'content': 'Why is the sky blue?',\n },\n]\n\nfor part in"
},
{
"path": "examples/chat-with-history.py",
"chars": 1232,
"preview": "from ollama import chat\n\nmessages = [\n {\n 'role': 'user',\n 'content': 'Why is the sky blue?',\n },\n {\n 'role'"
},
{
"path": "examples/chat.py",
"chars": 192,
"preview": "from ollama import chat\n\nmessages = [\n {\n 'role': 'user',\n 'content': 'Why is the sky blue?',\n },\n]\n\nresponse = "
},
{
"path": "examples/create.py",
"chars": 203,
"preview": "from ollama import Client\n\nclient = Client()\nresponse = client.create(\n model='my-assistant',\n from_='gemma3',\n syste"
},
{
"path": "examples/embed.py",
"chars": 114,
"preview": "from ollama import embed\n\nresponse = embed(model='llama3.2', input='Hello, world!')\nprint(response['embeddings'])\n"
},
{
"path": "examples/fill-in-middle.py",
"chars": 346,
"preview": "from ollama import generate\n\nprompt = '''def remove_non_ascii(s: str) -> str:\n \"\"\" '''\n\nsuffix = \"\"\"\n return resul"
},
{
"path": "examples/generate-image.py",
"chars": 572,
"preview": "# Image generation is experimental and currently only available on macOS\n\nimport base64\n\nfrom ollama import generate\n\npr"
},
{
"path": "examples/generate-logprobs.py",
"chars": 667,
"preview": "from typing import Iterable\n\nimport ollama\n\n\ndef print_logprobs(logprobs: Iterable[dict], label: str) -> None:\n print(f"
},
{
"path": "examples/generate-stream.py",
"chars": 144,
"preview": "from ollama import generate\n\nfor part in generate('gemma3', 'Why is the sky blue?', stream=True):\n print(part['response"
},
{
"path": "examples/generate.py",
"chars": 111,
"preview": "from ollama import generate\n\nresponse = generate('gemma3', 'Why is the sky blue?')\nprint(response['response'])\n"
},
{
"path": "examples/gpt-oss-tools-stream.py",
"chars": 3036,
"preview": "# /// script\n# requires-python = \">=3.11\"\n# dependencies = [\n# \"gpt-oss\",\n# \"ollama\",\n# \"rich\",\n# ]\n# ///\nim"
},
{
"path": "examples/gpt-oss-tools.py",
"chars": 2426,
"preview": "# /// script\n# requires-python = \">=3.11\"\n# dependencies = [\n# \"gpt-oss\",\n# \"ollama\",\n# \"rich\",\n# ]\n# ///\nim"
},
{
"path": "examples/list.py",
"chars": 452,
"preview": "from ollama import ListResponse, list\n\nresponse: ListResponse = list()\n\nfor model in response.models:\n print('Name:', m"
},
{
"path": "examples/multi-tool.py",
"chars": 3042,
"preview": "import random\nfrom typing import Iterator\n\nfrom ollama import ChatResponse, Client\n\n\ndef get_temperature(city: str) -> i"
},
{
"path": "examples/multimodal-chat.py",
"chars": 500,
"preview": "from ollama import chat\n\n# from pathlib import Path\n\n# Pass in the path to the image\npath = input('Please enter the path"
},
{
"path": "examples/multimodal-generate.py",
"chars": 663,
"preview": "import random\nimport sys\n\nimport httpx\n\nfrom ollama import generate\n\nlatest = httpx.get('https://xkcd.com/info.0.json')\n"
},
{
"path": "examples/ps.py",
"chars": 813,
"preview": "from ollama import ProcessResponse, chat, ps, pull\n\n# Ensure at least one model is loaded\nresponse = pull('gemma3', stre"
},
{
"path": "examples/pull.py",
"chars": 601,
"preview": "from tqdm import tqdm\n\nfrom ollama import pull\n\ncurrent_digest, bars = '', {}\nfor progress in pull('gemma3', stream=True"
},
{
"path": "examples/show.py",
"chars": 476,
"preview": "from ollama import ShowResponse, show\n\nresponse: ShowResponse = show('gemma3')\nprint('Model Information:')\nprint(f'Modif"
},
{
"path": "examples/structured-outputs-image.py",
"chars": 1338,
"preview": "from pathlib import Path\nfrom typing import Literal\n\nfrom pydantic import BaseModel\n\nfrom ollama import chat\n\n\n# Define "
},
{
"path": "examples/structured-outputs.py",
"chars": 1082,
"preview": "from pydantic import BaseModel\n\nfrom ollama import chat\n\n\n# Define the schema for the response\nclass FriendInfo(BaseMode"
},
{
"path": "examples/thinking-generate.py",
"chars": 208,
"preview": "from ollama import generate\n\nresponse = generate('deepseek-r1', 'why is the sky blue', think=True)\n\nprint('Thinking:\\n=="
},
{
"path": "examples/thinking-levels.py",
"chars": 546,
"preview": "from ollama import chat\n\n\ndef heading(text):\n print(text)\n print('=' * len(text))\n\n\nmessages = [\n {'role': 'user', 'c"
},
{
"path": "examples/thinking.py",
"chars": 291,
"preview": "from ollama import chat\n\nmessages = [\n {\n 'role': 'user',\n 'content': 'What is 10 + 23?',\n },\n]\n\nresponse = chat"
},
{
"path": "examples/tools.py",
"chars": 2495,
"preview": "from ollama import ChatResponse, chat\n\n\ndef add_two_numbers(a: int, b: int) -> int:\n \"\"\"\n Add two numbers\n\n Args:\n "
},
{
"path": "examples/web-search-gpt-oss.py",
"chars": 2526,
"preview": "# /// script\n# requires-python = \">=3.11\"\n# dependencies = [\n# \"ollama\",\n# ]\n# ///\nfrom typing import Any, Dict, Lis"
},
{
"path": "examples/web-search-mcp.py",
"chars": 2854,
"preview": "# /// script\n# requires-python = \">=3.11\"\n# dependencies = [\n# \"mcp\",\n# \"rich\",\n# \"ollama\",\n# ]\n# ///\n\"\"\"\nMCP stdi"
},
{
"path": "examples/web-search.py",
"chars": 2872,
"preview": "# /// script\n# requires-python = \">=3.11\"\n# dependencies = [\n# \"rich\",\n# \"ollama\",\n# ]\n# ///\nfrom typing import "
},
{
"path": "examples/web_search_gpt_oss_helper.py",
"chars": 15595,
"preview": "from __future__ import annotations\n\nimport re\nfrom dataclasses import dataclass, field\nfrom datetime import datetime\nfro"
},
{
"path": "ollama/__init__.py",
"chars": 1058,
"preview": "from ollama._client import AsyncClient, Client\nfrom ollama._types import (\n ChatResponse,\n EmbeddingsResponse,\n Embed"
},
{
"path": "ollama/_client.py",
"chars": 40068,
"preview": "import contextlib\nimport ipaddress\nimport json\nimport os\nimport platform\nimport sys\nimport urllib.parse\nfrom hashlib imp"
},
{
"path": "ollama/_types.py",
"chars": 17243,
"preview": "import contextlib\nimport json\nfrom base64 import b64decode, b64encode\nfrom datetime import datetime\nfrom pathlib import "
},
{
"path": "ollama/_utils.py",
"chars": 2732,
"preview": "from __future__ import annotations\n\nimport inspect\nimport re\nfrom collections import defaultdict\nfrom typing import Call"
},
{
"path": "ollama/py.typed",
"chars": 0,
"preview": ""
},
{
"path": "pyproject.toml",
"chars": 1464,
"preview": "[project]\nname = 'ollama'\ndescription = 'The official Python client for Ollama.'\nauthors = [\n { email = 'hello@ollama"
},
{
"path": "requirements.txt",
"chars": 11275,
"preview": "# This file was autogenerated by uv via the following command:\n# uv export\n-e .\nannotated-types==0.7.0 \\\n --hash=s"
},
{
"path": "tests/test_client.py",
"chars": 45611,
"preview": "import base64\nimport json\nimport os\nimport re\nimport tempfile\nfrom pathlib import Path\nfrom typing import Any\n\nimport py"
},
{
"path": "tests/test_type_serialization.py",
"chars": 3291,
"preview": "import tempfile\nfrom base64 import b64encode\nfrom pathlib import Path\n\nimport pytest\n\nfrom ollama._types import CreateRe"
},
{
"path": "tests/test_utils.py",
"chars": 8922,
"preview": "import json\nimport sys\nfrom typing import Dict, List, Mapping, Sequence, Set, Tuple, Union\n\nfrom ollama._utils import co"
}
]
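Each entry in the condensed preview above is an object with "path", "chars", and "preview" keys. A minimal sketch of how such a list might be loaded and summarized follows. The two-entry excerpt and the summarize helper are illustrative assumptions, not part of the repository or of the extraction tool.

```python
import json

# Hypothetical two-entry excerpt in the same shape as the preview list above.
preview_json = '''
[
  {"path": "examples/embed.py", "chars": 114, "preview": "from ollama import embed\\n"},
  {"path": "ollama/py.typed", "chars": 0, "preview": ""}
]
'''


def summarize(entries):
    """Return (file count, total character count, path of the largest file)."""
    total = sum(entry["chars"] for entry in entries)
    largest = max(entries, key=lambda entry: entry["chars"])
    return len(entries), total, largest["path"]


entries = json.loads(preview_json)
count, total_chars, largest_path = summarize(entries)
print(count, total_chars, largest_path)
```

The same helper should work on the full list, provided every entry carries an integer "chars" field as the entries shown here do.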
About this extraction
This page contains the full source code of the ollama/ollama-python GitHub repository, extracted and formatted as plain text. The extraction includes 52 files (192.8 KB), approximately 54.5k tokens, and a symbol index with 293 extracted functions, classes, methods, constants, and types.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.