Repository: langchain-ai/opengpts
Branch: main
Commit: 8e7fef72e1f7
Files: 132
Total size: 1.9 MB

Directory structure:
gitextract_57j5153_/

├── .github/
│   ├── actions/
│   │   └── poetry_setup/
│   │       └── action.yml
│   └── workflows/
│       ├── _lint.yml
│       ├── build_deploy_image.yml
│       └── ci.yml
├── .gitignore
├── API.md
├── CONTRIBUTING.md
├── Dockerfile
├── LICENSE
├── README.md
├── auth.md
├── backend/
│   ├── .gitignore
│   ├── Dockerfile
│   ├── Makefile
│   ├── README.md
│   ├── app/
│   │   ├── __init__.py
│   │   ├── agent.py
│   │   ├── agent_types/
│   │   │   ├── __init__.py
│   │   │   ├── prompts.py
│   │   │   ├── tools_agent.py
│   │   │   └── xml_agent.py
│   │   ├── api/
│   │   │   ├── __init__.py
│   │   │   ├── assistants.py
│   │   │   ├── runs.py
│   │   │   └── threads.py
│   │   ├── auth/
│   │   │   ├── __init__.py
│   │   │   ├── handlers.py
│   │   │   └── settings.py
│   │   ├── chatbot.py
│   │   ├── checkpoint.py
│   │   ├── ingest.py
│   │   ├── lifespan.py
│   │   ├── llms.py
│   │   ├── message_types.py
│   │   ├── parsing.py
│   │   ├── retrieval.py
│   │   ├── schema.py
│   │   ├── server.py
│   │   ├── storage.py
│   │   ├── stream.py
│   │   ├── tools.py
│   │   └── upload.py
│   ├── log_config.json
│   ├── migrations/
│   │   ├── 000001_create_extensions_and_first_tables.down.sql
│   │   ├── 000001_create_extensions_and_first_tables.up.sql
│   │   ├── 000002_checkpoints_update_schema.down.sql
│   │   ├── 000002_checkpoints_update_schema.up.sql
│   │   ├── 000003_create_user.down.sql
│   │   ├── 000003_create_user.up.sql
│   │   ├── 000004_add_metadata_to_thread.down.sql
│   │   ├── 000004_add_metadata_to_thread.up.sql
│   │   ├── 000005_advanced_checkpoints_schema.down.sql
│   │   └── 000005_advanced_checkpoints_schema.up.sql
│   ├── pyproject.toml
│   └── tests/
│       ├── __init__.py
│       └── unit_tests/
│           ├── __init__.py
│           ├── agent_executor/
│           │   ├── __init__.py
│           │   ├── test_parsing.py
│           │   └── test_upload.py
│           ├── app/
│           │   ├── __init__.py
│           │   ├── helpers.py
│           │   ├── test_app.py
│           │   └── test_auth.py
│           ├── conftest.py
│           ├── fixtures/
│           │   ├── __init__.py
│           │   ├── sample.docx
│           │   ├── sample.epub
│           │   ├── sample.html
│           │   ├── sample.odt
│           │   ├── sample.rtf
│           │   └── sample.txt
│           ├── test_imports.py
│           └── utils.py
├── docker-compose-prod.yml
├── docker-compose.yml
├── frontend/
│   ├── .eslintrc.cjs
│   ├── .gitignore
│   ├── Dockerfile
│   ├── README.md
│   ├── index.html
│   ├── package.json
│   ├── postcss.config.js
│   ├── src/
│   │   ├── App.tsx
│   │   ├── api/
│   │   │   ├── assistants.ts
│   │   │   └── threads.ts
│   │   ├── components/
│   │   │   ├── Chat.tsx
│   │   │   ├── ChatList.tsx
│   │   │   ├── Config.tsx
│   │   │   ├── ConfigList.tsx
│   │   │   ├── Document.tsx
│   │   │   ├── FileUpload.tsx
│   │   │   ├── JsonEditor.tsx
│   │   │   ├── LangSmithActions.tsx
│   │   │   ├── Layout.tsx
│   │   │   ├── Message.tsx
│   │   │   ├── MessageEditor.tsx
│   │   │   ├── NewChat.tsx
│   │   │   ├── NotFound.tsx
│   │   │   ├── OrphanChat.tsx
│   │   │   ├── String.tsx
│   │   │   ├── StringEditor.tsx
│   │   │   ├── Tool.tsx
│   │   │   └── TypingBox.tsx
│   │   ├── constants.ts
│   │   ├── hooks/
│   │   │   ├── useChatList.ts
│   │   │   ├── useChatMessages.ts
│   │   │   ├── useConfigList.ts
│   │   │   ├── useMessageEditing.ts
│   │   │   ├── useSchemas.ts
│   │   │   ├── useStatePersist.tsx
│   │   │   ├── useStreamState.tsx
│   │   │   ├── useThreadAndAssistant.ts
│   │   │   └── useToolsSchemas.ts
│   │   ├── index.css
│   │   ├── main.tsx
│   │   ├── types.ts
│   │   ├── utils/
│   │   │   ├── cn.ts
│   │   │   ├── defaults.ts
│   │   │   ├── formTypes.ts
│   │   │   ├── json-refs.d.ts
│   │   │   ├── json-refs.js
│   │   │   ├── simplifySchema.ts
│   │   │   └── str.ts
│   │   └── vite-env.d.ts
│   ├── tailwind.config.js
│   ├── tsconfig.json
│   ├── tsconfig.node.json
│   └── vite.config.ts
└── tools/
    └── redis_to_postgres/
        ├── Dockerfile
        ├── README.md
        ├── docker-compose.yml
        └── migrate_data.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/actions/poetry_setup/action.yml
================================================
# An action for setting up poetry install with caching.
# Using a custom action since the default action does not
# take poetry install groups into account.
# Action code from:
# https://github.com/actions/setup-python/issues/505#issuecomment-1273013236
name: poetry-install-with-caching
description: Poetry install with support for caching of dependency groups.

inputs:
  python-version:
    description: Python version, supporting MAJOR.MINOR only
    required: true

  poetry-version:
    description: Poetry version
    required: true

  cache-key:
    description: Cache key to use for manual handling of caching
    required: true

  working-directory:
    description: Directory whose poetry.lock file should be cached
    required: true

runs:
  using: composite
  steps:
    - uses: actions/setup-python@v4
      name: Setup python ${{ inputs.python-version }}
      with:
        python-version: ${{ inputs.python-version }}

    - uses: actions/cache@v3
      id: cache-bin-poetry
      name: Cache Poetry binary - Python ${{ inputs.python-version }}
      env:
        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "1"
      with:
        path: |
          /opt/pipx/venvs/poetry
        # This step caches the poetry installation, so make sure it's keyed on the poetry version as well.
        key: bin-poetry-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}-${{ inputs.poetry-version }}

    - name: Refresh shell hashtable and fixup softlinks
      if: steps.cache-bin-poetry.outputs.cache-hit == 'true'
      shell: bash
      env:
        POETRY_VERSION: ${{ inputs.poetry-version }}
        PYTHON_VERSION: ${{ inputs.python-version }}
      run: |
        set -eux

        # Refresh the shell hashtable, to ensure correct `which` output.
        hash -r

        # `actions/cache@v3` doesn't always seem able to correctly unpack softlinks.
        # Delete and recreate the softlinks pipx expects to have.
        rm /opt/pipx/venvs/poetry/bin/python
        cd /opt/pipx/venvs/poetry/bin
        ln -s "$(which "python$PYTHON_VERSION")" python
        chmod +x python
        cd /opt/pipx_bin/
        ln -s /opt/pipx/venvs/poetry/bin/poetry poetry
        chmod +x poetry

        # Ensure everything got set up correctly.
        /opt/pipx/venvs/poetry/bin/python --version
        /opt/pipx_bin/poetry --version

    - name: Install poetry
      if: steps.cache-bin-poetry.outputs.cache-hit != 'true'
      shell: bash
      env:
        POETRY_VERSION: ${{ inputs.poetry-version }}
        PYTHON_VERSION: ${{ inputs.python-version }}
      run: pipx install "poetry==$POETRY_VERSION" --python "python$PYTHON_VERSION" --verbose

    - name: Restore pip and poetry cached dependencies
      uses: actions/cache@v3
      env:
        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "4"
        WORKDIR: ${{ inputs.working-directory == '' && '.' || inputs.working-directory }}
      with:
        path: |
          ~/.cache/pip
          ~/.cache/pypoetry/virtualenvs
          ~/.cache/pypoetry/cache
          ~/.cache/pypoetry/artifacts
          ${{ env.WORKDIR }}/.venv
        key: py-deps-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}-poetry-${{ inputs.poetry-version }}-${{ inputs.cache-key }}-${{ hashFiles(format('{0}/**/poetry.lock', env.WORKDIR)) }}


================================================
FILE: .github/workflows/_lint.yml
================================================
name: lint

on:
  workflow_call:
    inputs:
      working-directory:
        required: true
        type: string
        description: "From which folder this pipeline executes"

env:
  POETRY_VERSION: "1.5.1"
  WORKDIR: ${{ inputs.working-directory == '' && '.' || inputs.working-directory }}

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Only lint on the min and max supported Python versions.
        # It's extremely unlikely that there's a lint issue on any version in between
        # that doesn't show up on the min or max versions.
        #
        # GitHub rate-limits how many jobs can be running at any one time.
        # Starting new jobs is also relatively slow,
        # so linting on fewer versions makes CI faster.
        python-version:
          - "3.9"
          - "3.11"
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
        uses: "./.github/actions/poetry_setup"
        with:
          python-version: ${{ matrix.python-version }}
          poetry-version: ${{ env.POETRY_VERSION }}
          working-directory: ${{ inputs.working-directory }}
          cache-key: lint-with-extras

      - name: Check Poetry File
        shell: bash
        working-directory: ${{ inputs.working-directory }}
        run: |
          poetry check

      - name: Check lock file
        shell: bash
        working-directory: ${{ inputs.working-directory }}
        run: |
          poetry lock --check

      - name: Install dependencies
        # Also installs dev/lint/test/typing dependencies, to ensure we have
        # type hints for as many of our libraries as possible.
        # This helps catch errors that require dependencies to be spotted, for example:
        # https://github.com/langchain-ai/langchain/pull/10249/files#diff-935185cd488d015f026dcd9e19616ff62863e8cde8c0bee70318d3ccbca98341
        #
        # If you change this configuration, make sure to change the `cache-key`
        # in the `poetry_setup` action above to stop using the old cache.
        # It doesn't matter how you change it, any change will cause a cache-bust.
        working-directory: ${{ inputs.working-directory }}
        run: |
          poetry install --with dev,lint,test
        # Add typing dependencies once we roll out mypy
        # poetry install --with dev,lint,test,typing

      - name: Analysing the code with our lint
        working-directory: ${{ inputs.working-directory }}
        env:
          BLACK_CACHE_DIR: .black_cache
        run: |
          make lint


================================================
FILE: .github/workflows/build_deploy_image.yml
================================================
name: Build, Push, and Deploy Open GPTS

on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  build-and-push:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout
      uses: actions/checkout@v3

    - name: Set up Short Hash
      run: |
        echo "GIT_SHORT_SHA=$(git rev-parse --short HEAD)" >> $GITHUB_ENV

    - name: Set up depot.dev multi-arch runner
      uses: depot/setup-action@v1

    - name: Login to DockerHub
      uses: docker/login-action@v2
      with:
        username: ${{ secrets.LANGCHAIN_DOCKERHUB_USERNAME }}
        password: ${{ secrets.LANGCHAIN_DOCKERHUB_PASSWORD }}

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Build and push
      uses: docker/build-push-action@v5
      with:
        push: true
        platforms: linux/amd64,linux/arm64
        tags: "docker.io/langchain/open-gpts:${{ env.GIT_SHORT_SHA }}, docker.io/langchain/open-gpts:latest"


================================================
FILE: .github/workflows/ci.yml
================================================
---
name: CI

on:
  push:
    branches: [main]
  pull_request: # Trigger on all PRs, ensuring required actions to be run.
  workflow_dispatch: # Allows triggering the workflow manually in the GitHub UI

# If another push to the same PR or branch happens while this workflow is still running,
# cancel the earlier run in favor of the next run.
#
# There's no point in testing an outdated version of the code. GitHub only allows
# a limited number of job runners to be active at the same time, so it's better to cancel
# pointless jobs early so that more useful jobs can run sooner.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  POETRY_VERSION: "1.5.1"
  WORKDIR: "./backend"

jobs:
  lint:
    uses: ./.github/workflows/_lint.yml
    with:
      working-directory: "./backend"
    secrets: inherit

  test:
    timeout-minutes: 5
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ${{ env.WORKDIR }}
    strategy:
      matrix:
        python-version:
          - "3.9"
          - "3.10"
          - "3.11"
    name: Python ${{ matrix.python-version }} tests
    services:
      # Label used to access the service container
      postgres:
        image: pgvector/pgvector:pg16
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: postgres
        # Set health checks to wait until postgres has started
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - "5432:5432"
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
        uses: "./.github/actions/poetry_setup"
        with:
          python-version: ${{ matrix.python-version }}
          poetry-version: ${{ env.POETRY_VERSION }}
          working-directory: .
          cache-key: langserve-all
      - name: Install dependencies
        run: |
          poetry install --with test
      - name: Install golang-migrate
        run: |
          wget -O golang-migrate.deb https://github.com/golang-migrate/migrate/releases/download/v4.17.0/migrate.linux-amd64.deb
          sudo dpkg -i golang-migrate.deb && rm golang-migrate.deb
      - name: Run tests
        env:
          POSTGRES_HOST: localhost
          POSTGRES_PORT: 5432
          POSTGRES_DB: postgres
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          SCARF_NO_ANALYTICS: true
        run: make test

  frontend-lint-and-build:
      runs-on: ubuntu-latest
      needs: [lint, test]
      steps:
        - uses: actions/checkout@v3
        - name: Setup Node.js (LTS)
          uses: actions/setup-node@v3
          with:
            node-version: '20'
            cache: 'yarn'
            cache-dependency-path: frontend/yarn.lock
        - name: Install frontend dependencies
          run: yarn install
          working-directory: ./frontend
        - name: Run frontend lint
          run: yarn lint
          working-directory: ./frontend
        - name: Build frontend
          run: yarn build
          working-directory: ./frontend


================================================
FILE: .gitignore
================================================
*.env
.env.gcp.yaml
postgres-volume/
redis-volume/
backend/ui

# Operating System generated files
.DS_Store
Thumbs.db
ehthumbs.db
Desktop.ini

# Python artifacts
__pycache__/
*.py[cod]
.venv/
*.egg-info/
dist/

# Node.js / frontend artifacts
node_modules/
/dist
/dist-ssr
.npm
.npmrc
.yarn-cache
.yarn-integrity
.yarn.lock
package-lock.json
.pnpm-lock.yaml

# IDEs and editors
.vscode/*
# Include recommended extensions for VS Code users
!.vscode/extensions.json
.idea/
*.sublime-*
*.sublime-workspace
*.atom/
*.iml

# Microsoft Visual Studio
*.suo
*.ntvs*
*.njsproj
*.sln

# Swap and Temporary Files
*.swp
*.swo
*~
*.bak
*.tmp
*.temp

# Log files
*.log*
logs/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*


================================================
FILE: API.md
================================================
# API Getting Started

This documentation covers how to get started with the API that backs OpenGPTs.
This allows you to easily integrate it with a different frontend of your choice.

For full API documentation, see [localhost:8100/docs](http://localhost:8100/docs) after deployment.

If you want to see the API docs before deployment, check out the [hosted docs here](https://opengpts-example-vz4y4ooboq-uc.a.run.app/docs).

In the examples below, cookies are used as a mock auth method. For production, we recommend using JWT auth. Refer to the [auth guide for production](auth.md) for more information.
When using JWT auth, you will need to include the JWT in the `Authorization` header as a Bearer token.
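
As a minimal sketch of the JWT case, the snippet below builds (but does not send) a request that carries a Bearer token, using only the standard library. The `build_authorized_request` helper and the `"my-jwt"` token are placeholders for illustration and are not part of OpenGPTs.

```python
import json
import urllib.request


def build_authorized_request(url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that carries the JWT as a Bearer token (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "my-jwt" is a placeholder; use a token issued by your auth provider.
req = build_authorized_request(
    "http://127.0.0.1:8100/assistants",
    "my-jwt",
    {"name": "bar", "config": {"configurable": {}}, "public": True},
)
# urllib.request.urlopen(req) would send it to a running server.
```

With `requests`, the equivalent is to pass `headers={"Authorization": f"Bearer {token}"}` in place of the `cookies=` argument used in the examples below.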

## Create an Assistant

First, let's use the API to create an assistant. 
This should look something like:

```python
import requests
requests.post('http://127.0.0.1:8100/assistants', json={
  "name": "bar",
  "config": {"configurable": {}},
  "public": True
}, cookies= {"opengpts_user_id": "foo"}).content
```
This is creating an assistant with name `"bar"`, with default configuration, that is public, and is associated with user `"foo"`.

This should return something like:

```shell
b'{"assistant_id":"9c7d7e6e-654b-4eaa-b160-f19f922fc63b","name":"string","config":{"configurable":{}},"updated_at":"2023-11-20T16:24:30.520340","public":true,"user_id":"foo"}'
```

The config parameters allow you to set the LLM used, the instructions of the assistant, and the tools used.


```
{
  "name": "bar",
  "config": {
    "configurable": {
      "type": "agent",
      "type==agent/agent_type": "GPT 3.5 Turbo",
      "type==agent/system_message": "You are a helpful assistant",
      "type==agent/tools": ["Wikipedia"]
    }
  },
  "public": True
}
```
This creates a public assistant named `"bar"` that uses GPT 3.5 Turbo, the system message `"You are a helpful assistant"`, and the Wikipedia tool.

Available tool names can be found in the `AvailableTools` class in `backend/packages/gizmo-agent/gizmo_agent/tools.py`.

Available LLMs can be found in `GizmoAgentType` in `backend/packages/gizmo-agent/gizmo_agent/agent_types/__init__.py`.

Note: If a RAGBot assistant is created (`type` equals `chat_retrieval`), then subsequent API requests/responses for the threads APIs are slightly modified and noted below.
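
The configurable keys shown above can be assembled programmatically. The following is a small sketch rather than an official client: `build_assistant_payload` is a hypothetical helper, and it only covers the `agent` type with the keys that appear in the example above.

```python
import json


def build_assistant_payload(name, agent_type, system_message, tools, public=False):
    """Assemble the JSON body for POST /assistants (hypothetical helper)."""
    return {
        "name": name,
        "config": {
            "configurable": {
                "type": "agent",
                "type==agent/agent_type": agent_type,
                "type==agent/system_message": system_message,
                "type==agent/tools": list(tools),
            }
        },
        "public": public,
    }


payload = build_assistant_payload(
    "bar", "GPT 3.5 Turbo", "You are a helpful assistant", ["Wikipedia"], public=True
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as the `json=` argument of the `requests.post` call shown earlier.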

## Create a thread

We can now create a thread.
Unlike OpenAI's Assistants API, we require starting the thread with an assistant ID.

```python
import requests
requests.post('http://127.0.0.1:8100/threads', cookies= {"opengpts_user_id": "foo"}, json={
    "name": "hi",
    "assistant_id": "9c7d7e6e-654b-4eaa-b160-f19f922fc63b"
}).content
```

This is creating a thread, named `"hi"`, with the assistant ID that we just created, for the same user.

This should return something like:

```shell
b'{"thread_id":"231dc7f3-33ee-4040-98fe-27f6e2aa8b2b","assistant_id":"9c7d7e6e-654b-4eaa-b160-f19f922fc63b","name":"hi","updated_at":"2023-11-20T16:26:39.083817","user_id":"foo"}'
```

## Add a message

We can check the thread, and see that it is currently empty:

```python
import requests
requests.get(
    'http://127.0.0.1:8100/threads/231dc7f3-33ee-4040-98fe-27f6e2aa8b2b/state', 
    cookies= {"opengpts_user_id": "foo"}
).content
```
```shell
b'{"values":[]}'
```
For RAGBot:
```shell
b'{"values":{"messages":[]}}'
```

Let's add a message to the thread!

```python
import requests
requests.post(
    'http://127.0.0.1:8100/threads/231dc7f3-33ee-4040-98fe-27f6e2aa8b2b/state', 
    cookies= {"opengpts_user_id": "foo"}, json={
        "values": [{
            "content": "hi! my name is bob",
            "type": "human",
        }]
    }
).content
```
For RAGBot:
```
{
    "values": {
        "messages": [{
            "content": "hi! my name is bob",
            "type": "human",
        }]
    }
}
```

If we now run the command to see the thread, we can see that there is now a message on that thread.

```python
import requests
requests.get(
    'http://127.0.0.1:8100/threads/231dc7f3-33ee-4040-98fe-27f6e2aa8b2b/state', 
    cookies= {"opengpts_user_id": "foo"}
).content
```
```shell
b'{"values":[{"content":"hi! my name is bob","additional_kwargs":{},"type":"human","example":false}],"next":[]}'
```
For RAGBot:
```shell
b'{"values":{"messages":[...]},"next":[]}'
```
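
Because the state comes back in two shapes (a bare list for agents, a `{"messages": [...]}` object for RAGBot), client code may want to normalize it. A minimal sketch, where `extract_messages` is a hypothetical helper name:

```python
def extract_messages(state: dict) -> list:
    """Return the message list from a thread state, whichever shape it has."""
    values = state.get("values", [])
    if isinstance(values, dict):
        # RAGBot shape: {"values": {"messages": [...]}}
        return values.get("messages", [])
    # Agent shape: {"values": [...]}
    return values


agent_state = {"values": [{"content": "hi! my name is bob", "type": "human"}], "next": []}
ragbot_state = {"values": {"messages": [{"content": "hi! my name is bob", "type": "human"}]}, "next": []}

print(extract_messages(agent_state) == extract_messages(ragbot_state))
```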

## Run the assistant on that thread

We can now run the assistant on that thread.

```python
import requests
requests.post('http://127.0.0.1:8100/runs', cookies= {"opengpts_user_id": "foo"}, json={
    "assistant_id": "9c7d7e6e-654b-4eaa-b160-f19f922fc63b",
    "thread_id": "231dc7f3-33ee-4040-98fe-27f6e2aa8b2b",
    "input": {
        "messages": []
    }
}).content
```
This runs the thread with the same id that we just created, with the assistant that we created, with no additional input messages (see below for how to add input messages).

If we now check the thread, we can see (after a bit) that there is a message from the AI.

```python
import requests
requests.get('http://127.0.0.1:8100/threads/231dc7f3-33ee-4040-98fe-27f6e2aa8b2b/state', cookies= {"opengpts_user_id": "foo"}).content
```
```shell
b'{"values":[{"content":"hi! my name is bob","additional_kwargs":{},"type":"human","example":false},{"content":"Hello, Bob! How can I assist you today?","additional_kwargs":{"agent":{"return_values":{"output":"Hello, Bob! How can I assist you today?"},"log":"Hello, Bob! How can I assist you today?","type":"AgentFinish"}},"type":"ai","example":false}],"next":[]}'
```
For RAGBot:
```shell
b'{"values":{"messages":[...]},"next":[]}'
```
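
Since the AI message only shows up "after a bit", a client typically polls the state endpoint. The sketch below is one way to do that; `wait_for_ai_message` and its `fetch_state` argument are hypothetical. It also assumes that "the last message has type `ai`" means the run is done, which may not hold for agents that take intermediate steps.

```python
import time


def wait_for_ai_message(fetch_state, timeout=30.0, interval=1.0):
    """Poll until the last message in the thread is from the AI, or time out.

    `fetch_state` is any callable returning the parsed JSON of
    GET /threads/{thread_id}/state.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()
        values = state.get("values", [])
        # Handle both the agent (list) and RAGBot ({"messages": [...]}) shapes.
        messages = values.get("messages", []) if isinstance(values, dict) else values
        if messages and messages[-1].get("type") == "ai":
            return messages[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError("no AI message appeared before the timeout")
        time.sleep(interval)
```

For example, with `requests` you could pass `lambda: requests.get(state_url, cookies={"opengpts_user_id": "foo"}).json()` as `fetch_state`, where `state_url` is the thread's `/state` URL.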

## Run the assistant on the thread with new messages

We can also run the assistant on a thread and add new messages at the same time.
Continuing the example above, we can run:

```python
import requests
requests.post('http://127.0.0.1:8100/runs', cookies= {"opengpts_user_id": "foo"}, json={
    "assistant_id": "9c7d7e6e-654b-4eaa-b160-f19f922fc63b",
    "thread_id": "231dc7f3-33ee-4040-98fe-27f6e2aa8b2b",
    "input": {
        "messages": [{
            "content": "whats my name? respond in spanish",
            "type": "human",
        }]
    }
}).content
```

Then, if we call the threads endpoint after a bit we can see the human message - as well as an AI message - get added to the thread.

```python
import requests
requests.get('http://127.0.0.1:8100/threads/231dc7f3-33ee-4040-98fe-27f6e2aa8b2b/state', cookies= {"opengpts_user_id": "foo"}).content
```

```shell
b'{"values":[{"content":"hi! my name is bob","additional_kwargs":{},"type":"human","example":false},{"content":"Hello, Bob! How can I assist you today?","additional_kwargs":{"agent":{"return_values":{"output":"Hello, Bob! How can I assist you today?"},"log":"Hello, Bob! How can I assist you today?","type":"AgentFinish"}},"type":"ai","example":false},{"content":"whats my name? respond in spanish","additional_kwargs":{},"type":"human","example":false},{"content":"Tu nombre es Bob.","additional_kwargs":{"agent":{"return_values":{"output":"Tu nombre es Bob."},"log":"Tu nombre es Bob.","type":"AgentFinish"}},"type":"ai","example":false}],"next":[]}'
```
For RAGBot:
```shell
b'{"values":{"messages":[...]},"next":[]}'
```

## Stream
One thing we can do is stream back responses.
This works for both messages and tokens.
Below is an example of streaming back tokens for a response.

```python
import requests
import json
response = requests.post(
    'http://127.0.0.1:8100/runs/stream', 
    cookies= {"opengpts_user_id": "foo"}, json={
    "assistant_id": "9c7d7e6e-654b-4eaa-b160-f19f922fc63b",
    "thread_id": "231dc7f3-33ee-4040-98fe-27f6e2aa8b2b",
    "input": {
        "messages": [{
            "content": "have a good day!",
            "type": "human",
        }]
    }
})
res = []
if response.status_code == 200:
    # Iterate over the response
    for line in response.iter_lines():
        if line:  # filter out keep-alive new lines
            string_line = line.decode("utf-8")
            # Only look at lines where data is returned
            if string_line.startswith('data'):
                json_string = string_line[len('data: '):]
                # Get the json response - contains a list of all messages
                json_value = json.loads(json_string)
                if "messages" in json_value:
                    # Get the content from the last message
                    # If you want to display multiple messages (eg if agent takes intermediate steps) you will need to change this logic
                    print(json_value['messages'][-1]['content'])
else:
    print(f"Failed to retrieve data: {response.status_code}")
```

This streams the following:

```shell
You
You too
You too!
You too! If
You too! If you
You too! If you have
You too! If you have any
You too! If you have any other
You too! If you have any other questions
You too! If you have any other questions,
You too! If you have any other questions, feel
You too! If you have any other questions, feel free
You too! If you have any other questions, feel free to
You too! If you have any other questions, feel free to ask
You too! If you have any other questions, feel free to ask.
You too! If you have any other questions, feel free to ask.
You too! If you have any other questions, feel free to ask.
```
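
As the output shows, each event carries the full text so far rather than only the new token. The sketch below separates the parsing from the network call so it can be run offline, and derives the newly added text from consecutive events. The helper names and the sample lines are made up for illustration; they mirror the `data: ` prefix and `messages` key used in the example above.

```python
import json


def iter_message_contents(lines):
    """Yield the content of the last message from each `data: ` line."""
    for line in lines:
        if not line:
            continue  # skip keep-alive blank lines
        text = line.decode("utf-8") if isinstance(line, bytes) else line
        if not text.startswith("data: "):
            continue
        payload = json.loads(text[len("data: "):])
        if isinstance(payload, dict) and payload.get("messages"):
            yield payload["messages"][-1]["content"]


def new_text(previous: str, current: str) -> str:
    """Return only the newly streamed suffix when `current` extends `previous`."""
    return current[len(previous):] if current.startswith(previous) else current


# Invented sample lines standing in for response.iter_lines().
sample = [
    b'data: {"messages": [{"content": "You", "type": "ai"}]}',
    b"",
    b'data: {"messages": [{"content": "You too", "type": "ai"}]}',
]

previous = ""
for content in iter_message_contents(sample):
    print(new_text(previous, content), end="")
    previous = content
```

In a real client, `response.iter_lines()` from a request made with `stream=True` would take the place of `sample`.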


================================================
FILE: CONTRIBUTING.md
================================================
# Contributing

## Contributor License Agreement

We are grateful to the contributors who help evolve OpenGPTs and dedicate their time to the project. As the primary sponsor of OpenGPTs, LangChain, Inc. aims to build products in the open that benefit thousands of developers while allowing us to build a sustainable business. For all code contributions to OpenGPTs, we ask that contributors complete and sign a Contributor License Agreement (“CLA”). The agreement between contributors and the project is explicit, so OpenGPTs users can be confident in the legal status of the source code and their right to use it. The CLA does not change the terms of the underlying license, OpenGPTs License, used by our software.

When you open your first PR, a bot will comment on it asking you to agree to the CLA if you haven't already. Agreeing to the CLA is required before code can be merged and only needs to happen on the first contribution to the project. All subsequent contributions will fall under the same CLA.

================================================
FILE: Dockerfile
================================================
FROM node:20 AS builder

WORKDIR /frontend

COPY ./frontend/package.json ./frontend/yarn.lock ./

RUN yarn --network-timeout 600000 --frozen-lockfile

COPY ./frontend ./

RUN rm -rf .env

RUN yarn build

# Backend Dockerfile
FROM python:3.11

ARG TARGETOS
ARG TARGETARCH
ARG TARGETVARIANT

# Install system dependencies
RUN apt-get update && rm -rf /var/lib/apt/lists/*
RUN wget -O golang-migrate.deb https://github.com/golang-migrate/migrate/releases/download/v4.17.0/migrate.${TARGETOS}-${TARGETARCH}${TARGETVARIANT}.deb \
    && dpkg -i golang-migrate.deb \
    && rm golang-migrate.deb

# Install Poetry
RUN pip install poetry

# Set the working directory
WORKDIR /backend

# Copy only dependencies
COPY ./backend/pyproject.toml ./backend/poetry.lock* ./

# Install dependencies
# --only main: Skip installing packages listed in the [tool.poetry.dev-dependencies] section
RUN poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-ansi --only main

# Copy the rest of backend
COPY ./backend .

# Copy the frontend build
COPY --from=builder /frontend/dist ./ui

ENTRYPOINT [ "uvicorn", "app.server:app", "--host", "0.0.0.0", "--log-config", "log_config.json" ]


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) LangChain, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

================================================
FILE: README.md
================================================
# OpenGPTs

This is an open source effort to create a similar experience to OpenAI's GPTs and Assistants API.
It is powered by [LangGraph](https://github.com/langchain-ai/langgraph) - a framework for creating agent runtimes.
It also builds upon [LangChain](https://github.com/langchain-ai/langchain), [LangServe](https://github.com/langchain-ai/langserve) and [LangSmith](https://smith.langchain.com/).
OpenGPTs gives you more control, allowing you to configure:

- The LLM you use (choose between the 60+ that LangChain offers)
- The prompts you use (use LangSmith to debug those)
- The tools you give it (choose from LangChain's 100+ tools, or easily write your own)
- The vector database you use (choose from LangChain's 60+ vector database integrations)
- The retrieval algorithm you use
- The chat history database you use

Most importantly, it gives you full control over the **cognitive architecture** of your application.
Currently, there are three different architectures implemented:

- Assistant
- RAG
- Chatbot

See below for more details on those.
Because this is open source, if you do not like those architectures or want to modify them, you can easily do that!

<p align="center">
    <img alt="Configure" src="_static/configure.png" width="49%" />
    <img alt="Chat" src="_static/chat.png" width="49%" />
</p>

**Key Links**

- [GPTs: a simple hosted version](https://opengpts-example-vz4y4ooboq-uc.a.run.app/)
- [Assistants API: a getting started guide](API.md)
- [Auth: a guide for production](auth.md)

## Quickstart with Docker

This project supports a Docker-based setup, streamlining installation and execution. It automatically builds images for 
the frontend and backend and sets up Postgres using docker-compose.


1. **Prerequisites:**  
    Ensure you have Docker and docker-compose installed on your system.


2. **Clone the Repository:**  
   Obtain the project files by cloning the repository.

   ```
   git clone https://github.com/langchain-ai/opengpts.git
   cd opengpts
   ```

3. **Set Up Environment Variables:**  
   Create a `.env` file in the root directory of the project by copying `.env.example` as a template, and add the 
   following environment variables:
   ```shell
   # At least one language model API key is required
   OPENAI_API_KEY=sk-...
   # LANGCHAIN_TRACING_V2=true
   # LANGCHAIN_API_KEY=...
   
   # Setup for Postgres. Docker compose will use these values to set up the database.
   POSTGRES_PORT=5432
   POSTGRES_DB=opengpts
   POSTGRES_USER=postgres
   POSTGRES_PASSWORD=...
   ```

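   Replace `sk-...` with your OpenAI API key and set `POSTGRES_PASSWORD` to a password of your choice. If you enable LangSmith tracing, uncomment the two `LANGCHAIN_*` lines and replace `...` with your LangChain API key.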


4. **Run with Docker Compose:**  
   In the root directory of the project, execute:

   ```
   docker compose up
   ```

   This command builds the Docker images for the frontend and backend from their respective Dockerfiles and starts all 
   necessary services, including Postgres.

5. **Access the Application:**  
   With the services running, access the frontend at [http://localhost:5173](http://localhost:5173). If you configured 
   a different port, substitute it for `5173`.


6. **Rebuilding After Changes:**  
   If you make changes to either the frontend or backend, rebuild the Docker images to reflect these changes. Run:
   ```
   docker compose up --build
   ```
   This command rebuilds the images with your latest changes and restarts the services.


## Quickstart without Docker

**Prerequisites**

The following instructions assume you have Python 3.11+ installed on your system. We strongly recommend using a virtual 
environment to manage dependencies.

For example, if you are using `pyenv`, you can create a new virtual environment with:
```shell
pyenv install 3.11
pyenv virtualenv 3.11 opengpts
pyenv activate opengpts
```

Once your Python environment is set up, install the tooling needed for the project dependencies.
The backend service uses [poetry](https://python-poetry.org/docs/#installation) to manage dependencies:

```shell 
pip install poetry
pip install langchain-community
```

**Install Postgres and the Postgres Vector Extension**
```
brew install postgresql pgvector
brew services start postgresql
```

**Configure persistence layer**

The backend uses Postgres for saving agent configurations and chat message history.
In order to use this, you need to set the following environment variables:

```shell
export POSTGRES_HOST=localhost
export POSTGRES_PORT=5432
export POSTGRES_DB=opengpts
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=...
```
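
For reference, the `migrate` target in `backend/Makefile` combines these variables into a single connection URL. The sketch below shows the same assembly in Python. The helper name `postgres_url` and the sample values are illustrative assumptions and are not part of the codebase. The percent-encoding step is an addition that the Makefile does not perform.

```python
import os
from urllib.parse import quote


def postgres_url(env=None):
    """Build a postgres:// URL from the POSTGRES_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    user = quote(env["POSTGRES_USER"], safe="")
    # Percent-encode the password so characters such as "@" or "/" stay valid in a URL.
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env["POSTGRES_PORT"]
    db = env["POSTGRES_DB"]
    return f"postgres://{user}:{password}@{host}:{port}/{db}?sslmode=disable"


# Placeholder values for illustration only.
example_env = {
    "POSTGRES_HOST": "localhost",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DB": "opengpts",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "p@ss",
}
print(postgres_url(example_env))
```

If any of the five variables is missing, the helper raises a `KeyError`, which surfaces an incomplete configuration early.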

**Create the database**
```shell
createdb opengpts
```

**Connect to the database and create the `postgres` role**
```shell
psql -d opengpts
```

```sql
CREATE ROLE postgres WITH LOGIN SUPERUSER CREATEDB CREATEROLE;
```

**Install Golang Migrate**

Database migrations are managed with [golang-migrate](https://github.com/golang-migrate/migrate). 

On macOS, you can install it with `brew install golang-migrate`. Instructions for other operating systems, or for 
installing via the Go toolchain, can be found [here](https://github.com/golang-migrate/migrate/blob/master/cmd/migrate/README.md#installation).

Once `golang-migrate` is installed, you can run all the migrations with:
```shell
make migrate
```

This will enable the backend to use Postgres as a vector database and create the initial tables.


**Install backend dependencies**
```shell
cd backend
poetry install
```


**Alternate vector databases**

The instructions above use Postgres as a vector database,
although you can easily switch this out to use any of the 60+ vector databases in LangChain.

**Set up language models**

By default, this uses OpenAI, but there are also options for Azure OpenAI and Anthropic.
If you are using those, you may need to set different environment variables.

```shell
export OPENAI_API_KEY="sk-..."
```

To use other language models, you will need to set additional environment variables.
See the `LLMs` section below for how to configure Azure OpenAI, Anthropic, and Amazon Bedrock.

**Set up tools**

By default, a large number of tools are available.
Some of these require additional environment variables.
You do not need to use any of these tools, and the environment variables are not required to start the app
(they are only required when the corresponding tool is called).

For a full list of environment variables to enable, see the `Tools` section below.

**Set up monitoring**

Set up [LangSmith](https://smith.langchain.com/).
This is optional, but it helps with debugging, logging, and monitoring.
Sign up at the link above, then set the relevant environment variables:

```shell
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY=...
```

**Start the backend server**

```shell
make start
```

### Start the frontend

```shell
cd frontend
npm install
npm run dev
```

Navigate to [http://localhost:5173/](http://localhost:5173/) and enjoy!

## Migrating data from Redis to Postgres

Refer to this [guide](tools/redis_to_postgres/README.md) for migrating data from Redis to Postgres.

## Breaking Changes

### Migration 5 - Checkpoint Management Update
Version 5 of the database migrations introduces a significant change to how thread checkpoints are managed:
- Transitions from a pickle-based checkpointing system to a new multi-table checkpoint management system (breaking change)
- Aligns with LangGraph's new checkpoint architecture for better state management and persistence
- **Important**: Historical threads/checkpoints (created before this migration) will not be accessible in the UI
- Previous checkpoint data is preserved in the `old_checkpoints` table but cannot be accessed by the new system
- This architectural change improves how thread state is stored and managed, enabling more reliable state persistence in LangGraph-based agents.

## Features

As much as possible, we are striving for feature parity with OpenAI.

- [x] Sandbox - Provides an environment to import, test, and modify existing chatbots.
  - The chatbots used are all in code, so are easily editable
- [x] Custom Actions - Define additional functionality for your chatbot using OpenAPI specifications
  - Supported by adding tools
- [x] Knowledge Files - attach additional files that your chatbot can reference
  - Upload files from the UI or API, used by Retrieval tool
- [x] Tools - Provides basic tools for web browsing, image creation, etc.
  - Basic DuckDuckGo and PythonREPL tools enabled by default
  - Image creation coming soon
- [x] Analytics - View and analyze chatbot usage data
  - Use LangSmith for this
- [x] Drafts - Save and share drafts of chatbots you're creating
  - Supports saving of configurations
- [x] Publishing - publicly distribute your completed chatbot
  - Can do by deploying via LangServe
- [x] Sharing - Set up and manage chatbot sharing
  - Can do by deploying via LangServe
- [ ] Marketplace - Search and deploy chatbots created by other users
  - Coming soon

## Repo Structure

- `frontend`: Code for the frontend
- `backend`: Code for the backend
  - `app`: LangServe code (for exposing APIs)
  - `packages`: Core logic
    - `agent-executor`: Runtime for the agent
    - `gizmo-agent`: Configuration for the agent

## Customization

The big appeal of OpenGPTs as compared to using OpenAI directly is that it is more customizable.
Specifically, you can choose which language models to use as well as more easily add custom tools.
You can also use the underlying APIs directly and build a custom UI yourself should you choose.

### Cognitive Architecture

This refers to the logic of how the GPT works.
There are currently three different architectures supported, but because they are all written in LangGraph, it is very 
easy to modify them or add your own.

The three different architectures supported are assistants, RAG, and chatbots.

**Assistants**

Assistants can be equipped with an arbitrary number of tools and use an LLM to decide when to use them. This makes them 
the most flexible choice, but they work well with fewer models and can be less reliable.

When creating an assistant, you specify a few things.

First, you choose the language model to use. Only a few language models work reliably: GPT-3.5, GPT-4, 
Claude, and Gemini.

Second, you choose the tools to use. These can be predefined tools OR a retriever constructed from uploaded files. You 
can choose however many you want.

The cognitive architecture can then be thought of as a loop. First, the LLM is called to determine what (if any) 
actions to take. If it decides to take actions, then those actions are executed and it loops back. If it decides to 
take no actions, then the response of the LLM is the final response, and the loop finishes.
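
The loop described above can be sketched in plain Python. This is a simplified illustration and not the LangGraph implementation used in this repo: `run_assistant_loop`, `fake_llm`, and the `add` tool are hypothetical names, and the stand-in model is hard-coded to request one tool call before answering.

```python
from typing import Callable, Dict, List, Tuple

# A decision is either ("final", text) or ("action", tool_name, tool_input).
Decision = Tuple[str, ...]


def run_assistant_loop(
    decide: Callable[[List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_steps: int = 10,
) -> str:
    """Call the model, run any requested tool, and repeat until it answers."""
    history = [f"user: {user_input}"]
    for _ in range(max_steps):
        decision = decide(history)
        if decision[0] == "final":
            return decision[1]  # no action requested: this is the final response
        _, tool_name, tool_input = decision
        observation = tools[tool_name](tool_input)
        history.append(f"tool {tool_name}: {observation}")  # loop back with the result
    raise RuntimeError("step limit reached without a final answer")


def fake_llm(history: List[str]) -> Decision:
    """Stand-in for a real LLM: asks for one tool call, then answers."""
    if not any(line.startswith("tool ") for line in history):
        return ("action", "add", "2,3")
    return ("final", f"The result is {history[-1].split(': ')[1]}")


def add(arg: str) -> str:
    """Toy tool that adds two comma-separated integers."""
    a, b = arg.split(",")
    return str(int(a) + int(b))


answer = run_assistant_loop(fake_llm, {"add": add}, "What is 2 + 3?")
print(answer)
```

The `max_steps` guard is a design choice in this sketch: without some bound, a model that keeps requesting actions would loop forever.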

![](_static/agent.png)

This can be a really powerful and flexible architecture, and it is probably the closest to how we humans operate. 
However, assistants can also be unreliable, and they generally only work with the more capable models (and even then 
they can make mistakes). Therefore, we introduced a few simpler architectures.

Assistants are implemented with [LangGraph](https://github.com/langchain-ai/langgraph) `MessageGraph`. A `MessageGraph` is a graph that models its state as a `list` of messages.

**RAGBot**

One of the big use cases of the GPT store is uploading files and giving the bot knowledge of those files. What would it 
mean to make an architecture more focused on that use case?

We added RAGBot - a retrieval-focused GPT with a straightforward architecture. First, a set of documents is retrieved. 
Then, those documents are passed in the system message to a separate call to the language model so it can respond.
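
As a rough illustration of this two-step flow, the sketch below retrieves once and then makes a single model call with the documents placed in the system message. It does not use LangGraph or a real vector store. The keyword-matching retriever, `rag_answer`, and `echo_llm` are hypothetical stand-ins chosen so the example runs on its own.

```python
from typing import Callable, List


def keyword_retriever(docs: List[str]) -> Callable[[str], List[str]]:
    """Toy retriever: returns documents sharing at least one word with the query."""
    def retrieve(query: str) -> List[str]:
        words = set(query.lower().split())
        return [d for d in docs if words & set(d.lower().split())]
    return retrieve


def rag_answer(question, retrieve, llm, system_message):
    # Step 1: always retrieve, exactly once.
    context = "\n".join(retrieve(question))
    # Step 2: pass the documents in the system message of a separate model call.
    messages = [
        {"role": "system", "content": f"{system_message}\n\nContext:\n{context}"},
        {"role": "user", "content": question},
    ]
    return llm(messages)


def echo_llm(messages):
    """Stand-in model that reports how many context lines it received."""
    context = messages[0]["content"].split("Context:\n", 1)[1]
    count = len([line for line in context.splitlines() if line])
    return f"answered using {count} document(s)"


docs = ["Postgres stores chat history", "LangGraph models agent state"]
result = rag_answer(
    "where is chat history kept", keyword_retriever(docs), echo_llm, "Be brief."
)
print(result)
```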

Compared to assistants, it is more structured (but less powerful). It ALWAYS looks something up - which is good if you 
know you want to look things up, but potentially wasteful if the user is just trying to have a normal conversation. 
Also importantly, it only looks things up once - so if it doesn't find the right results, it will yield a bad 
result (whereas an assistant could decide to look things up again).

![](_static/rag.png)

Despite being a simpler architecture, it is good for a few reasons. First, because it is simpler, it can work 
pretty well with a wider variety of models (including lots of open source models). Second, if you have a use case where 
you don't NEED the flexibility of an assistant (e.g., you know users will be looking up information every time), then it 
can be more focused. And third, compared to the final architecture below, it can use external knowledge.

RAGBot is implemented with [LangGraph](https://github.com/langchain-ai/langgraph) `StateGraph`. A `StateGraph` is a generalized graph that can model arbitrary state (i.e. `dict`), not just a `list` of messages.

**ChatBot**

The final architecture is dead simple - just a call to a language model, parameterized by a system message. This allows 
the GPT to take on different personas and characters. This is clearly far less powerful than Assistants or RAGBots 
(which have access to external sources of data/computation) - but it's still valuable! A lot of popular GPTs are just 
system messages at the end of the day, and CharacterAI is crushing it despite largely just being system messages as 
well.
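
To make the idea concrete, here is a minimal sketch of a chatbot that differs from another only by its system message. The names `make_chatbot` and `fake_llm` are hypothetical, and the stand-in model simply reacts to a keyword in the system message so the example runs without an API key.

```python
def make_chatbot(llm, system_message):
    """Return a function that prepends a fixed system message to every call."""
    def chat(history, user_input):
        messages = [{"role": "system", "content": system_message}]
        messages += history
        messages.append({"role": "user", "content": user_input})
        return llm(messages)
    return chat


def fake_llm(messages):
    """Stand-in model: replies in upper case if the persona asks for shouting."""
    text = messages[-1]["content"]
    return text.upper() if "shout" in messages[0]["content"] else text


shouter = make_chatbot(fake_llm, "You shout everything.")
plain = make_chatbot(fake_llm, "You are a helpful assistant.")
print(shouter([], "hello"))
print(plain([], "hello"))
```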

![](_static/chatbot.png)

ChatBot is implemented with [LangGraph](https://github.com/langchain-ai/langgraph) `StateGraph`. A `StateGraph` is a generalized graph that can model arbitrary state (i.e. `dict`), not just a `list` of messages.

### LLMs

You can choose between different LLMs to use.
This takes advantage of LangChain's many integrations.
It is important to note that depending on which LLM you use, you may need to change how you are prompting it.

We have exposed four agent types by default:

- "GPT 3.5 Turbo"
- "GPT 4"
- "Azure OpenAI"
- "Claude 2"

We will work to add more when we have confidence they can work well.

If you want to add your own LLM or agent configuration, or want to edit the existing ones, you can find them in 
`backend/app/agent_types`

#### Claude 2

If using Claude 2, you will need to set the following environment variable:

```shell
export ANTHROPIC_API_KEY=sk-...
```

#### Azure OpenAI

If using Azure OpenAI, you will need to set the following environment variables:

```shell
export AZURE_OPENAI_API_BASE=...
export AZURE_OPENAI_API_VERSION=...
export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_DEPLOYMENT_NAME=...
```

#### Amazon Bedrock

If using Amazon Bedrock, you need either valid credentials in `~/.aws/credentials` or the following environment 
variables set:

```shell
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
```

### Tools

One of the big benefits of having this be open source is that you can more easily add tools (directly in Python).

In practice, most teams we see define their own tools.
This is easy to do within LangChain.
See [this guide](https://python.langchain.com/docs/modules/agents/tools/custom_tools) for details on how to best do 
this.

If you want to use some preconfigured tools, these include:

**_Sema4.ai Action Server_**

Run AI Python based actions with [Sema4.ai Action Server](https://github.com/Sema4AI/actions).
Does not require a service API key, but it requires the credentials for a running Action Server instance to be defined.
These you set while creating an assistant.

**_Connery Actions_**

Connect OpenGPTs to the real world with [Connery](https://github.com/connery-io/connery).

Requires setting two environment variables, which you get during the [Connery Runner setup](https://docs.connery.io/docs/runner/quick-start/):

```shell
CONNERY_RUNNER_URL=https://your-personal-connery-runner-url
CONNERY_RUNNER_API_KEY=...
```

**DuckDuckGo Search**

Search the web with [DuckDuckGo](https://pypi.org/project/duckduckgo-search/). Does not require any API keys.

**Tavily Search**

Uses the [Tavily](https://app.tavily.com/) search engine. Requires setting an environment variable:

```shell
export TAVILY_API_KEY=tvly-...
```

Sign up for an API key [here](https://app.tavily.com/).

**Tavily Search (Answer Only)**

Uses the [Tavily](https://app.tavily.com/) search engine.
This returns only the answer, no supporting evidence.
Good when you need a short response (small context windows).
Requires setting an environment variable:

```shell
export TAVILY_API_KEY=tvly-...
```

Sign up for an API key [here](https://app.tavily.com/).

**You.com Search**

Uses [You.com](https://you.com/) search, which returns responses optimized for LLMs.
Requires setting an environment variable:

```shell
export YDC_API_KEY=...
```

Sign up for an API key [here](https://you.com/).

**SEC Filings (Kay.ai)**

Searches through SEC filings using [Kay.ai](https://www.kay.ai/).
Requires setting an environment variable:

```shell
export KAY_API_KEY=...
```

Sign up for an API key [here](https://www.kay.ai/).

**Press Releases (Kay.ai)**

Searches through press releases using [Kay.ai](https://www.kay.ai/).
Requires setting an environment variable:

```shell
export KAY_API_KEY=...
```

Sign up for an API key [here](https://www.kay.ai/)

**Arxiv**

Searches [Arxiv](https://arxiv.org/). Does not require any API keys.

**PubMed**

Searches [PubMed](https://pubmed.ncbi.nlm.nih.gov/). Does not require any API keys.

**Wikipedia**

Searches [Wikipedia](https://pypi.org/project/wikipedia/). Does not require any API keys.

## Deployment

### Deploy via Cloud Run

**1. Build the frontend**

```shell
cd frontend
yarn
yarn build
```

**2. Deploy to Google Cloud Run**

To deploy to GCP Cloud Run, first create a `.env.gcp.yaml` file with the contents from `.env.gcp.yaml.example` and fill in the values. Then run:

```shell
gcloud run deploy opengpts --source . --port 8000 --env-vars-file .env.gcp.yaml --allow-unauthenticated \
--region us-central1 --min-instances 1
```

### Deploy in Kubernetes

We have a Helm chart for deploying the backend to Kubernetes. You can find more information here: 
[README.md](https://github.com/langchain-ai/helm/blob/main/charts/open-gpts/README.md)


================================================
FILE: auth.md
================================================
# Auth

By default, OpenGPTs uses cookies as a mock auth method, which is intended only for trying things out.
For production, we recommend using JWT auth, outlined below.

## JWT Auth: Options

There are two ways to use JWT: Local and OIDC. The main difference is in how the key
used to decode the JWT is obtained. For the Local method, you'll provide the decode
key as a Base64-encoded string in an environment variable. For the OIDC method, the
key is obtained from the OIDC provider automatically.

### JWT OIDC

If you're looking to integrate with an identity provider, OIDC is the way to go.
It will figure out the decode key for you, so you don't have to worry about it.
Just set `AUTH_TYPE=jwt_oidc` along with the issuer and audience. Audience can
be one or many - just separate them with commas.

```bash
export AUTH_TYPE=jwt_oidc
export JWT_ISS=<issuer>
export JWT_AUD=<audience>  # or <audience1>,<audience2>,...
```
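
The backend's own parsing of `JWT_AUD` is not shown here. As a hedged sketch, a comma-separated value could be split and checked against a token's `aud` claim as below. The function names are hypothetical. Per the JWT specification, the `aud` claim may be either a single string or a list of strings, so the check handles both.

```python
def parse_audiences(raw: str) -> list:
    """Split a comma-separated JWT_AUD value into a list of audiences."""
    return [aud.strip() for aud in raw.split(",") if aud.strip()]


def audience_matches(token_aud, allowed) -> bool:
    """True if any audience in the token is in the allowed list."""
    token_auds = [token_aud] if isinstance(token_aud, str) else list(token_aud)
    return any(aud in allowed for aud in token_auds)


allowed = parse_audiences("api-a, api-b")
print(allowed)
print(audience_matches("api-b", allowed))
print(audience_matches(["other"], allowed))
```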

### JWT Local

To use JWT Local, set `AUTH_TYPE=jwt_local`. Then, set the issuer, audience,
algorithm used to sign the JWT, and the decode key in Base64 format.

```bash
export AUTH_TYPE=jwt_local
export JWT_ISS=<issuer>
export JWT_AUD=<audience>
export JWT_ALG=<algorithm>  # e.g. ES256
export JWT_DECODE_KEY_B64=<base64_decode_key>
```

Base64 is used for the decode key because handling multiline strings in environment
variables is error-prone. Base64 makes it a one-liner, easy to paste in and use.
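
As an illustration, a multi-line PEM key can be turned into a single-line value with Python's standard `base64` module. The PEM text below is a placeholder and not a real key.

```python
import base64

# Placeholder PEM text for illustration only; not a real key.
pem = "-----BEGIN PUBLIC KEY-----\nEXAMPLEKEYMATERIAL\n-----END PUBLIC KEY-----\n"

# Encode to a single line suitable for an environment variable.
encoded = base64.b64encode(pem.encode("utf-8")).decode("ascii")

# Decoding restores the original multi-line text exactly.
decoded = base64.b64decode(encoded).decode("utf-8")

print("\n" in encoded)
print(decoded == pem)
```

The encoded string contains no newlines, so it can be pasted directly after `JWT_DECODE_KEY_B64=`.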


## Making Requests

To make authenticated requests, include the JWT in the `Authorization` header as a Bearer token:

```
Authorization: Bearer <JWT>
```
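
A minimal sketch of attaching the header with Python's standard library follows. The URL is an assumption for illustration: port `8100` matches the `make start` target in `backend/Makefile`, while the `/assistants/` path is hypothetical. The request object is only constructed here, not sent.

```python
import urllib.request


def build_authenticated_request(url: str, token: str) -> urllib.request.Request:
    """Create a request carrying the JWT as a Bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# "<JWT>" stands in for a real token issued by your identity provider.
req = build_authenticated_request("http://localhost:8100/assistants/", "<JWT>")
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it, assuming a backend is running at that address.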




================================================
FILE: backend/.gitignore
================================================
.envrc
ui


================================================
FILE: backend/Dockerfile
================================================
# Backend Dockerfile
FROM python:3.11

ARG TARGETOS
ARG TARGETARCH
ARG TARGETVARIANT

# Install system dependencies
RUN apt-get update && rm -rf /var/lib/apt/lists/*
RUN wget -O golang-migrate.deb https://github.com/golang-migrate/migrate/releases/download/v4.17.0/migrate.${TARGETOS}-${TARGETARCH}${TARGETVARIANT}.deb \
    && dpkg -i golang-migrate.deb \
    && rm golang-migrate.deb

# Install Poetry
RUN pip install poetry

# Set the working directory
WORKDIR /backend

# Copy only dependencies
COPY pyproject.toml poetry.lock* ./

# Install all dependencies
RUN poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-ansi

# Copy the rest of application code
COPY . .

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --start-interval=1s --retries=3 CMD [ "curl", "-f", "http://localhost:8000/health" ]

ENTRYPOINT [ "uvicorn", "app.server:app", "--host", "0.0.0.0", "--log-config", "log_config.json" ]


================================================
FILE: backend/Makefile
================================================
.PHONY: all lint format test help

# Default target executed when no arguments are given to make.
all: help

build_ui:
	cd ../frontend && yarn build && cp -r dist/* ../backend/ui

######################
# TESTING AND COVERAGE
######################

# Define a variable for the test file path.
TEST_FILE ?= tests/unit_tests/

start:
	poetry run uvicorn app.server:app --reload --port 8100 --log-config log_config.json

migrate:
	migrate -database postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@$(POSTGRES_HOST):$(POSTGRES_PORT)/$(POSTGRES_DB)?sslmode=disable -path ./migrations up

test:
	# We need to update handling of env variables for tests
	YDC_API_KEY=placeholder OPENAI_API_KEY=placeholder poetry run pytest $(TEST_FILE)


test_watch:
	# We need to update handling of env variables for tests
	YDC_API_KEY=placeholder OPENAI_API_KEY=placeholder poetry run ptw . -- $(TEST_FILE)

######################
# LINTING AND FORMATTING
######################

# Define a variable for Python and notebook files.
PYTHON_FILES=.
lint format: PYTHON_FILES=.
lint_diff format_diff: PYTHON_FILES=$(shell git diff --relative=. --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')

lint lint_diff:
	poetry run ruff .
	poetry run ruff format $(PYTHON_FILES) --check

format format_diff:
	poetry run ruff format $(PYTHON_FILES)
	poetry run ruff --select I --fix $(PYTHON_FILES)

spell_check:
	poetry run codespell --toml pyproject.toml

spell_fix:
	poetry run codespell --toml pyproject.toml -w

######################
# HELP
######################

help:
	@echo '===================='
	@echo '-- LINTING --'
	@echo 'format                       - run code formatters'
	@echo 'lint                         - run linters'
	@echo 'spell_check                 	- run codespell on the project'
	@echo 'spell_fix                		- run codespell on the project and fix the errors'
	@echo '-- TESTS --'
	@echo 'coverage                     - run unit tests and generate coverage report'
	@echo 'test                         - run unit tests'
	@echo 'test TEST_FILE=<test_file>   - run all tests in file'
	@echo '-- DOCUMENTATION tasks are from the top-level Makefile --'


================================================
FILE: backend/README.md
================================================
# backend

## Database Migrations

### Migration 5 - Checkpoint Management Update
This migration introduces a significant change to thread checkpoint management:

#### Changes
- Transitions from single-table pickle storage to a robust multi-table checkpoint management system
- Implements LangGraph's latest checkpoint architecture for improved state persistence
- Preserves existing checkpoint data by renaming `checkpoints` table to `old_checkpoints`
- Introduces three new tables for better checkpoint management:
  - `checkpoints`: Core checkpoint metadata
  - `checkpoint_blobs`: Actual checkpoint data storage (compatible with LangGraph state serialization)
  - `checkpoint_writes`: Tracks checkpoint write operations
- Adds runtime initialization via `ensure_setup()` in the lifespan event

#### Impact
- **Breaking Change**: Historical threads/checkpoints (pre-migration) will not be accessible in the UI
- Previous checkpoint data remains preserved but inaccessible in the new system
- Designed to work seamlessly with LangGraph's state persistence requirements

#### Migration Details
- **Up Migration**: Safely preserves existing data by renaming the table
- **Down Migration**: Restores original table structure if needed
- New checkpoint management tables are automatically created at application startup


================================================
FILE: backend/app/__init__.py
================================================


================================================
FILE: backend/app/agent.py
================================================
from enum import Enum
from typing import Any, Dict, Mapping, Optional, Sequence, Union

from langchain_core.messages import AnyMessage
from langchain_core.runnables import (
    ConfigurableField,
    RunnableBinding,
)
from langgraph.graph.message import Messages
from langgraph.pregel import Pregel

from app.agent_types.tools_agent import get_tools_agent_executor
from app.agent_types.xml_agent import get_xml_agent_executor
from app.chatbot import get_chatbot_executor
from app.checkpoint import AsyncPostgresCheckpoint
from app.llms import (
    get_anthropic_llm,
    get_google_llm,
    get_mixtral_fireworks,
    get_ollama_llm,
    get_openai_llm,
)
from app.retrieval import get_retrieval_executor
from app.tools import (
    RETRIEVAL_DESCRIPTION,
    TOOLS,
    ActionServer,
    Arxiv,
    AvailableTools,
    Connery,
    DallE,
    DDGSearch,
    PressReleases,
    PubMed,
    Retrieval,
    SecFilings,
    Tavily,
    TavilyAnswer,
    Wikipedia,
    YouSearch,
    get_retrieval_tool,
    get_retriever,
)

Tool = Union[
    ActionServer,
    Connery,
    DDGSearch,
    Arxiv,
    YouSearch,
    SecFilings,
    PressReleases,
    PubMed,
    Wikipedia,
    Tavily,
    TavilyAnswer,
    Retrieval,
    DallE,
]


class AgentType(str, Enum):
    GPT_35_TURBO = "GPT 3.5 Turbo"
    GPT_4 = "GPT 4 Turbo"
    GPT_4O = "GPT 4o"
    AZURE_OPENAI = "GPT 4 (Azure OpenAI)"
    CLAUDE2 = "Claude 2"
    BEDROCK_CLAUDE2 = "Claude 2 (Amazon Bedrock)"
    GEMINI = "GEMINI"
    OLLAMA = "Ollama"


DEFAULT_SYSTEM_MESSAGE = "You are a helpful assistant."

CHECKPOINTER = AsyncPostgresCheckpoint()


def get_agent_executor(
    tools: list,
    agent: AgentType,
    system_message: str,
    interrupt_before_action: bool,
):
    if agent == AgentType.GPT_35_TURBO:
        llm = get_openai_llm()
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.GPT_4:
        llm = get_openai_llm(model="gpt-4-turbo")
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.GPT_4O:
        llm = get_openai_llm(model="gpt-4o")
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.AZURE_OPENAI:
        llm = get_openai_llm(azure=True)
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.CLAUDE2:
        llm = get_anthropic_llm()
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.BEDROCK_CLAUDE2:
        llm = get_anthropic_llm(bedrock=True)
        return get_xml_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.GEMINI:
        llm = get_google_llm()
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    elif agent == AgentType.OLLAMA:
        llm = get_ollama_llm()
        return get_tools_agent_executor(
            tools, llm, system_message, interrupt_before_action, CHECKPOINTER
        )
    else:
        raise ValueError("Unexpected agent type")


class ConfigurableAgent(RunnableBinding):
    tools: Sequence[Tool]
    agent: AgentType
    system_message: str = DEFAULT_SYSTEM_MESSAGE
    retrieval_description: str = RETRIEVAL_DESCRIPTION
    interrupt_before_action: bool = False
    assistant_id: Optional[str] = None
    thread_id: Optional[str] = ""
    user_id: Optional[str] = None

    def __init__(
        self,
        *,
        tools: Sequence[Tool],
        agent: AgentType = AgentType.GPT_35_TURBO,
        system_message: str = DEFAULT_SYSTEM_MESSAGE,
        assistant_id: Optional[str] = None,
        thread_id: Optional[str] = "",
        retrieval_description: str = RETRIEVAL_DESCRIPTION,
        interrupt_before_action: bool = False,
        kwargs: Optional[Mapping[str, Any]] = None,
        config: Optional[Mapping[str, Any]] = None,
        **others: Any,
    ) -> None:
        others.pop("bound", None)
        _tools = []
        for _tool in tools:
            if _tool["type"] == AvailableTools.RETRIEVAL:
                if assistant_id is None or thread_id is None:
                    raise ValueError(
                        "Both assistant_id and thread_id must be provided if Retrieval tool is used"
                    )
                _tools.append(
                    get_retrieval_tool(assistant_id, thread_id, retrieval_description)
                )
            else:
                tool_config = _tool.get("config", {})
                _returned_tools = TOOLS[_tool["type"]](**tool_config)
                if isinstance(_returned_tools, list):
                    _tools.extend(_returned_tools)
                else:
                    _tools.append(_returned_tools)
        _agent = get_agent_executor(
            _tools, agent, system_message, interrupt_before_action
        )
        agent_executor = _agent.with_config({"recursion_limit": 50})
        super().__init__(
            tools=tools,
            agent=agent,
            system_message=system_message,
            retrieval_description=retrieval_description,
            bound=agent_executor,
            kwargs=kwargs or {},
            config=config or {},
        )


class LLMType(str, Enum):
    GPT_35_TURBO = "GPT 3.5 Turbo"
    GPT_4 = "GPT 4 Turbo"
    GPT_4O = "GPT 4o"
    AZURE_OPENAI = "GPT 4 (Azure OpenAI)"
    CLAUDE2 = "Claude 2"
    BEDROCK_CLAUDE2 = "Claude 2 (Amazon Bedrock)"
    GEMINI = "GEMINI"
    MIXTRAL = "Mixtral"
    OLLAMA = "Ollama"


def get_chatbot(
    llm_type: LLMType,
    system_message: str,
):
    if llm_type == LLMType.GPT_35_TURBO:
        llm = get_openai_llm()
    elif llm_type == LLMType.GPT_4:
        llm = get_openai_llm(model="gpt-4")
    elif llm_type == LLMType.GPT_4O:
        llm = get_openai_llm(model="gpt-4o")
    elif llm_type == LLMType.AZURE_OPENAI:
        llm = get_openai_llm(azure=True)
    elif llm_type == LLMType.CLAUDE2:
        llm = get_anthropic_llm()
    elif llm_type == LLMType.BEDROCK_CLAUDE2:
        llm = get_anthropic_llm(bedrock=True)
    elif llm_type == LLMType.GEMINI:
        llm = get_google_llm()
    elif llm_type == LLMType.MIXTRAL:
        llm = get_mixtral_fireworks()
    elif llm_type == LLMType.OLLAMA:
        llm = get_ollama_llm()
    else:
        raise ValueError("Unexpected llm type")
    return get_chatbot_executor(llm, system_message, CHECKPOINTER)


class ConfigurableChatBot(RunnableBinding):
    llm: LLMType
    system_message: str = DEFAULT_SYSTEM_MESSAGE
    user_id: Optional[str] = None

    def __init__(
        self,
        *,
        llm: LLMType = LLMType.GPT_35_TURBO,
        system_message: str = DEFAULT_SYSTEM_MESSAGE,
        kwargs: Optional[Mapping[str, Any]] = None,
        config: Optional[Mapping[str, Any]] = None,
        **others: Any,
    ) -> None:
        others.pop("bound", None)

        chatbot = get_chatbot(llm, system_message)
        super().__init__(
            llm=llm,
            system_message=system_message,
            bound=chatbot,
            kwargs=kwargs or {},
            config=config or {},
        )


chatbot = (
    ConfigurableChatBot(llm=LLMType.GPT_35_TURBO, checkpoint=CHECKPOINTER)
    .configurable_fields(
        llm=ConfigurableField(id="llm_type", name="LLM Type"),
        system_message=ConfigurableField(id="system_message", name="Instructions"),
    )
    .with_types(
        input_type=Messages,
        output_type=Sequence[AnyMessage],
    )
)


class ConfigurableRetrieval(RunnableBinding):
    llm_type: LLMType
    system_message: str = DEFAULT_SYSTEM_MESSAGE
    assistant_id: Optional[str] = None
    thread_id: Optional[str] = ""
    user_id: Optional[str] = None

    def __init__(
        self,
        *,
        llm_type: LLMType = LLMType.GPT_35_TURBO,
        system_message: str = DEFAULT_SYSTEM_MESSAGE,
        assistant_id: Optional[str] = None,
        thread_id: Optional[str] = "",
        kwargs: Optional[Mapping[str, Any]] = None,
        config: Optional[Mapping[str, Any]] = None,
        **others: Any,
    ) -> None:
        others.pop("bound", None)
        retriever = get_retriever(assistant_id, thread_id)
        if llm_type == LLMType.GPT_35_TURBO:
            llm = get_openai_llm()
        elif llm_type == LLMType.GPT_4:
            llm = get_openai_llm(model="gpt-4-turbo")
        elif llm_type == LLMType.GPT_4O:
            llm = get_openai_llm(model="gpt-4o")
        elif llm_type == LLMType.AZURE_OPENAI:
            llm = get_openai_llm(azure=True)
        elif llm_type == LLMType.CLAUDE2:
            llm = get_anthropic_llm()
        elif llm_type == LLMType.BEDROCK_CLAUDE2:
            llm = get_anthropic_llm(bedrock=True)
        elif llm_type == LLMType.GEMINI:
            llm = get_google_llm()
        elif llm_type == LLMType.MIXTRAL:
            llm = get_mixtral_fireworks()
        elif llm_type == LLMType.OLLAMA:
            llm = get_ollama_llm()
        else:
            raise ValueError("Unexpected llm type")
        chatbot = get_retrieval_executor(llm, retriever, system_message, CHECKPOINTER)
        super().__init__(
            llm_type=llm_type,
            system_message=system_message,
            bound=chatbot,
            kwargs=kwargs or {},
            config=config or {},
        )


chat_retrieval = (
    ConfigurableRetrieval(llm_type=LLMType.GPT_35_TURBO, checkpoint=CHECKPOINTER)
    .configurable_fields(
        llm_type=ConfigurableField(id="llm_type", name="LLM Type"),
        system_message=ConfigurableField(id="system_message", name="Instructions"),
        assistant_id=ConfigurableField(
            id="assistant_id", name="Assistant ID", is_shared=True
        ),
        thread_id=ConfigurableField(
            id="thread_id", name="Thread ID", annotation=str, is_shared=True
        ),
    )
    .with_types(
        input_type=Dict[str, Any],
        output_type=Dict[str, Any],
    )
)


agent: Pregel = (
    ConfigurableAgent(
        agent=AgentType.GPT_35_TURBO,
        tools=[],
        system_message=DEFAULT_SYSTEM_MESSAGE,
        retrieval_description=RETRIEVAL_DESCRIPTION,
        assistant_id=None,
        thread_id="",
    )
    .configurable_fields(
        agent=ConfigurableField(id="agent_type", name="Agent Type"),
        system_message=ConfigurableField(id="system_message", name="Instructions"),
        interrupt_before_action=ConfigurableField(
            id="interrupt_before_action",
            name="Tool Confirmation",
            description="If Yes, you'll be prompted to continue before each tool is executed.\nIf No, tools will be executed automatically by the agent.",
        ),
        assistant_id=ConfigurableField(
            id="assistant_id", name="Assistant ID", is_shared=True
        ),
        thread_id=ConfigurableField(
            id="thread_id", name="Thread ID", annotation=str, is_shared=True
        ),
        tools=ConfigurableField(id="tools", name="Tools"),
        retrieval_description=ConfigurableField(
            id="retrieval_description", name="Retrieval Description"
        ),
    )
    .configurable_alternatives(
        ConfigurableField(id="type", name="Bot Type"),
        default_key="agent",
        prefix_keys=True,
        chatbot=chatbot,
        chat_retrieval=chat_retrieval,
    )
    .with_types(
        input_type=Messages,
        output_type=Sequence[AnyMessage],
    )
)

if __name__ == "__main__":
    import asyncio

    from langchain.schema.messages import HumanMessage

    async def run():
        async for m in agent.astream_events(
            HumanMessage(content="whats your name"),
            config={"configurable": {"user_id": "2", "thread_id": "test1"}},
            version="v1",
        ):
            print(m)

    asyncio.run(run())


================================================
FILE: backend/app/agent_types/__init__.py
================================================


================================================
FILE: backend/app/agent_types/prompts.py
================================================
xml_template = """{system_message}

You have access to the following tools:

{tools}

In order to use a tool, you can use <tool></tool> and <tool_input></tool_input> tags. You will then get back a response in the form <observation></observation>
For example, if you have a tool called 'search' that could run a google search, in order to search for the weather in SF you would respond:

<tool>search</tool><tool_input>weather in SF</tool_input>
<observation>64 degrees</observation>

When you are done, you can respond as normal to the user.

Example 1:

Human: Hi!

Assistant: Hi! How are you?

Human: What is the weather in SF?
Assistant: <tool>search</tool><tool_input>weather in SF</tool_input>
<observation>64 degrees</observation>
It is 64 degrees in SF


Begin!"""


================================================
FILE: backend/app/agent_types/tools_agent.py
================================================
from typing import cast

from langchain.tools import BaseTool
from langchain_core.language_models.base import LanguageModelLike
from langchain_core.messages import (
    AIMessage,
    FunctionMessage,
    HumanMessage,
    SystemMessage,
    ToolMessage,
)
from langgraph.checkpoint.base import BaseCheckpointSaver
from langgraph.graph import END
from langgraph.graph.message import MessageGraph
from langgraph.prebuilt import ToolExecutor, ToolInvocation

from app.message_types import LiberalToolMessage


def get_tools_agent_executor(
    tools: list[BaseTool],
    llm: LanguageModelLike,
    system_message: str,
    interrupt_before_action: bool,
    checkpoint: BaseCheckpointSaver,
):
    async def _get_messages(messages):
        msgs = []
        for m in messages:
            if isinstance(m, LiberalToolMessage):
                _dict = m.model_dump()
                _dict["content"] = str(_dict["content"])
                m_c = ToolMessage(**_dict)
                msgs.append(m_c)
            elif isinstance(m, FunctionMessage):
                # anthropic doesn't like function messages
                msgs.append(HumanMessage(content=str(m.content)))
            else:
                msgs.append(m)

        return [SystemMessage(content=system_message)] + msgs

    if tools:
        llm_with_tools = llm.bind_tools(tools)
    else:
        llm_with_tools = llm
    agent = _get_messages | llm_with_tools
    tool_executor = ToolExecutor(tools)

    # Define the function that determines whether to continue or not
    def should_continue(messages):
        last_message = messages[-1]
        # If there is no tool call, then we finish
        if not last_message.tool_calls:
            return "end"
        # Otherwise if there is, we continue
        else:
            return "continue"

    # Define the function to execute tools
    async def call_tool(messages):
        actions: list[ToolInvocation] = []
        # Based on the continue condition
        # we know the last message involves at least one tool call
        last_message = cast(AIMessage, messages[-1])
        for tool_call in last_message.tool_calls:
            # We construct a ToolInvocation from each tool call
            actions.append(
                ToolInvocation(
                    tool=tool_call["name"],
                    tool_input=tool_call["args"],
                )
            )
        # We call the tool_executor and get back a response
        responses = await tool_executor.abatch(actions)
        # We use the responses to create one LiberalToolMessage per tool call
        tool_messages = [
            LiberalToolMessage(
                tool_call_id=tool_call["id"],
                name=tool_call["name"],
                content=response,
            )
            for tool_call, response in zip(last_message.tool_calls, responses)
        ]
        return tool_messages

    workflow = MessageGraph()

    # Define the two nodes we will cycle between
    workflow.add_node("agent", agent)
    workflow.add_node("action", call_tool)

    # Set the entrypoint as `agent`
    # This means that this node is the first one called
    workflow.set_entry_point("agent")

    # We now add a conditional edge
    workflow.add_conditional_edges(
        # First, we define the start node. We use `agent`.
        # This means these are the edges taken after the `agent` node is called.
        "agent",
        # Next, we pass in the function that will determine which node is called next.
        should_continue,
        # Finally we pass in a mapping.
        # The keys are strings, and the values are other nodes.
        # END is a special node marking that the graph should finish.
        # What will happen is we will call `should_continue`, and then the output of that
        # will be matched against the keys in this mapping.
        # Based on which one it matches, that node will then be called.
        {
            # If `continue`, then we call the `action` node.
            "continue": "action",
            # Otherwise we finish.
            "end": END,
        },
    )

    # We now add a normal edge from `action` to `agent`.
    # This means that after `action` is called, `agent` node is called next.
    workflow.add_edge("action", "agent")

    # Finally, we compile it!
    # This compiles it into a LangChain Runnable,
    # meaning you can use it as you would any other runnable
    return workflow.compile(
        checkpointer=checkpoint,
        interrupt_before=["action"] if interrupt_before_action else None,
    )


================================================
FILE: backend/app/agent_types/xml_agent.py
================================================
from langchain.tools import BaseTool
from langchain.tools.render import render_text_description
from langchain_core.language_models.base import LanguageModelLike
from langchain_core.messages import (
    AIMessage,
    FunctionMessage,
    HumanMessage,
    SystemMessage,
)
from langgraph.checkpoint.base import BaseCheckpointSaver
from langgraph.graph import END
from langgraph.graph.message import MessageGraph
from langgraph.prebuilt import ToolExecutor, ToolInvocation

from app.agent_types.prompts import xml_template
from app.message_types import LiberalFunctionMessage


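def _collapse_messages(messages):
    """Collapse a run of non-human messages into a single AIMessage.

    The messages are expected to alternate action / observation,
    optionally followed by a final AI message.
    """
    log = ""
    if isinstance(messages[-1], AIMessage):
        scratchpad = messages[:-1]
        final = messages[-1]
    else:
        scratchpad = messages
        final = None
    if len(scratchpad) % 2 != 0:
        raise ValueError(
            "Expected alternating action/observation pairs, "
            f"got {len(scratchpad)} scratchpad messages"
        )
    for i in range(0, len(scratchpad), 2):
        action = scratchpad[i]
        observation = scratchpad[i + 1]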
        log += f"{action.content}<observation>{observation.content}</observation>"
    if final is not None:
        log += final.content
    return AIMessage(content=log)


def construct_chat_history(messages):
    """Collapse each run of messages between human turns into one AIMessage."""
    collapsed_messages = []
    temp_messages = []
    for message in messages:
        if isinstance(message, HumanMessage):
            if temp_messages:
                collapsed_messages.append(_collapse_messages(temp_messages))
                temp_messages = []
            collapsed_messages.append(message)
        elif isinstance(message, LiberalFunctionMessage):
            _dict = message.model_dump()
            _dict["content"] = str(_dict["content"])
            m_c = FunctionMessage(**_dict)
            temp_messages.append(m_c)
        else:
            temp_messages.append(message)

    # Don't forget to add the last non-human message if it exists
    if temp_messages:
        collapsed_messages.append(_collapse_messages(temp_messages))

    return collapsed_messages


def get_xml_agent_executor(
    tools: list[BaseTool],
    llm: LanguageModelLike,
    system_message: str,
    interrupt_before_action: bool,
    checkpoint: BaseCheckpointSaver,
):
    formatted_system_message = xml_template.format(
        system_message=system_message,
        tools=render_text_description(tools),
        tool_names=", ".join([t.name for t in tools]),
    )

    llm_with_stop = llm.bind(stop=["</tool_input>", "<observation>"])

    def _get_messages(messages):
        return [
            SystemMessage(content=formatted_system_message)
        ] + construct_chat_history(messages)

    agent = _get_messages | llm_with_stop
    tool_executor = ToolExecutor(tools)

    # Define the function that determines whether to continue or not
    def should_continue(messages):
        last_message = messages[-1]
        if "</tool>" in last_message.content:
            return "continue"
        else:
            return "end"

    # Define the function to execute tools
    async def call_tool(messages):
        # Based on the continue condition
        # we know the last message contains a <tool> call
        last_message = messages[-1]
        # We construct a ToolInvocation by parsing the <tool> and <tool_input> tags
        tool, tool_input = last_message.content.split("</tool>")
        _tool = tool.split("<tool>")[1]
        if "<tool_input>" not in tool_input:
            _tool_input = ""
        else:
            _tool_input = tool_input.split("<tool_input>")[1]
            if "</tool_input>" in _tool_input:
                _tool_input = _tool_input.split("</tool_input>")[0]
        action = ToolInvocation(
            tool=_tool,
            tool_input=_tool_input,
        )
        # We call the tool_executor and get back a response
        response = await tool_executor.ainvoke(action)
        # We use the response to create a LiberalFunctionMessage
        function_message = LiberalFunctionMessage(content=response, name=action.tool)
        # We return a single message; MessageGraph appends it to the existing list
        return function_message

    workflow = MessageGraph()

    # Define the two nodes we will cycle between
    workflow.add_node("agent", agent)
    workflow.add_node("action", call_tool)

    # Set the entrypoint as `agent`
    # This means that this node is the first one called
    workflow.set_entry_point("agent")

    # We now add a conditional edge
    workflow.add_conditional_edges(
        # First, we define the start node. We use `agent`.
        # This means these are the edges taken after the `agent` node is called.
        "agent",
        # Next, we pass in the function that will determine which node is called next.
        should_continue,
        # Finally we pass in a mapping.
        # The keys are strings, and the values are other nodes.
        # END is a special node marking that the graph should finish.
        # What will happen is we will call `should_continue`, and then the output of that
        # will be matched against the keys in this mapping.
        # Based on which one it matches, that node will then be called.
        {
            # If `continue`, then we call the `action` node.
            "continue": "action",
            # Otherwise we finish.
            "end": END,
        },
    )

    # We now add a normal edge from `action` to `agent`.
    # This means that after `action` is called, `agent` node is called next.
    workflow.add_edge("action", "agent")

    # Finally, we compile it!
    # This compiles it into a LangChain Runnable,
    # meaning you can use it as you would any other runnable
    return workflow.compile(
        checkpointer=checkpoint,
        interrupt_before=["action"] if interrupt_before_action else None,
    )


================================================
FILE: backend/app/api/__init__.py
================================================
from fastapi import APIRouter

from app.api.assistants import router as assistants_router
from app.api.runs import router as runs_router
from app.api.threads import router as threads_router

router = APIRouter()


@router.get("/ok")
async def ok():
    return {"ok": True}


router.include_router(
    assistants_router,
    prefix="/assistants",
    tags=["assistants"],
)
router.include_router(
    runs_router,
    prefix="/runs",
    tags=["runs"],
)
router.include_router(
    threads_router,
    prefix="/threads",
    tags=["threads"],
)


================================================
FILE: backend/app/api/assistants.py
================================================
from typing import Annotated, List
from uuid import uuid4

from fastapi import APIRouter, HTTPException, Path
from pydantic import BaseModel, Field

import app.storage as storage
from app.auth.handlers import AuthedUser
from app.schema import Assistant

router = APIRouter()


class AssistantPayload(BaseModel):
    """Payload for creating or updating an assistant."""

    name: Annotated[str, Field(description="The name of the assistant.")]
    config: Annotated[dict, Field(description="The assistant config.")]
    public: Annotated[
        bool, Field(default=False, description="Whether the assistant is public.")
    ]


AssistantID = Annotated[str, Path(description="The ID of the assistant.")]


@router.get("/")
async def list_assistants(user: AuthedUser) -> List[Assistant]:
    """List all assistants for the current user."""
    return await storage.list_assistants(user.user_id)


@router.get("/public/")
async def list_public_assistants() -> List[Assistant]:
    """List all public assistants."""
    return await storage.list_public_assistants()


@router.get("/{aid}")
async def get_assistant(
    user: AuthedUser,
    aid: AssistantID,
) -> Assistant:
    """Get an assistant by ID."""
    assistant = await storage.get_assistant(user.user_id, aid)
    if not assistant:
        raise HTTPException(status_code=404, detail="Assistant not found")
    return assistant


@router.post("")
async def create_assistant(
    user: AuthedUser,
    payload: AssistantPayload,
) -> Assistant:
    """Create an assistant."""
    return await storage.put_assistant(
        user.user_id,
        str(uuid4()),
        name=payload.name,
        config=payload.config,
        public=payload.public,
    )


@router.put("/{aid}")
async def upsert_assistant(
    user: AuthedUser,
    aid: AssistantID,
    payload: AssistantPayload,
) -> Assistant:
    """Create or update an assistant."""
    return await storage.put_assistant(
        user.user_id,
        aid,
        name=payload.name,
        config=payload.config,
        public=payload.public,
    )


@router.delete("/{aid}")
async def delete_assistant(
    user: AuthedUser,
    aid: AssistantID,
):
    """Delete an assistant by ID."""
    await storage.delete_assistant(user.user_id, aid)
    return {"status": "ok"}


================================================
FILE: backend/app/api/runs.py
================================================
from typing import Any, Dict, Optional, Sequence, Union
from uuid import UUID

import langsmith.client
from fastapi import APIRouter, BackgroundTasks, HTTPException
from fastapi.exceptions import RequestValidationError
from langchain_core.messages import AnyMessage
from langchain_core.runnables import RunnableConfig
from langsmith.utils import tracing_is_enabled
from pydantic import BaseModel, Field, ValidationError
from sse_starlette import EventSourceResponse

from app.agent import agent, chat_retrieval, chatbot
from app.auth.handlers import AuthedUser
from app.storage import get_assistant, get_thread
from app.stream import astream_state, to_sse

router = APIRouter()


class CreateRunPayload(BaseModel):
    """Payload for creating a run."""

    thread_id: str
    input: Optional[Union[Sequence[AnyMessage], Dict[str, Any]]] = Field(
        default_factory=dict
    )
    config: Optional[RunnableConfig] = None


async def _run_input_and_config(payload: CreateRunPayload, user_id: str):
    thread = await get_thread(user_id, payload.thread_id)
    if not thread:
        raise HTTPException(status_code=404, detail="Thread not found")

    assistant = await get_assistant(user_id, str(thread.assistant_id))
    if not assistant:
        raise HTTPException(status_code=404, detail="Assistant not found")

    config: RunnableConfig = {
        **assistant.config,
        "configurable": {
            **assistant.config["configurable"],
            **((payload.config or {}).get("configurable") or {}),
            "user_id": user_id,
            "thread_id": str(thread.thread_id),
            "assistant_id": str(assistant.assistant_id),
        },
    }

    try:
        if payload.input is not None:
            # Get the bot type from config
            bot_type = config["configurable"].get("type", "agent")
            # Get the correct schema based on bot type
            if bot_type == "chat_retrieval":
                schema = chat_retrieval.get_input_schema()
            elif bot_type == "chatbot":
                schema = chatbot.get_input_schema()
            else:  # default to agent
                schema = agent.get_input_schema()
            # Validate against the correct schema
            schema.model_validate(payload.input)
    except ValidationError as e:
        raise RequestValidationError(e.errors(), body=payload)

    return payload.input, config


@router.post("")
async def create_run(
    payload: CreateRunPayload,
    user: AuthedUser,
    background_tasks: BackgroundTasks,
):
    """Create a run."""
    input_, config = await _run_input_and_config(payload, user.user_id)
    background_tasks.add_task(agent.ainvoke, input_, config)
    return {"status": "ok"}  # TODO add a run id


@router.post("/stream")
async def stream_run(
    payload: CreateRunPayload,
    user: AuthedUser,
):
    """Create a run and stream its state as server-sent events."""
    input_, config = await _run_input_and_config(payload, user.user_id)

    return EventSourceResponse(to_sse(astream_state(agent, input_, config)))


@router.get("/input_schema")
async def input_schema() -> dict:
    """Return the input schema of the runnable."""
    return agent.get_input_schema().model_json_schema()


@router.get("/output_schema")
async def output_schema() -> dict:
    """Return the output schema of the runnable."""
    return agent.get_output_schema().model_json_schema()


@router.get("/config_schema")
async def config_schema() -> dict:
    """Return the config schema of the runnable."""
    return agent.config_schema().model_json_schema()


if tracing_is_enabled():
    langsmith_client = langsmith.client.Client()

    class FeedbackCreateRequest(BaseModel):
        """
        Shared information between create requests of feedback and feedback objects
        """

        run_id: UUID
        """The associated run ID this feedback is logged for."""

        key: str
        """The metric name, tag, or aspect to provide feedback on."""

        score: Optional[Union[float, int, bool]] = None
        """Value or score to assign the run."""

        value: Optional[Union[float, int, bool, str, Dict]] = None
        """The display value for the feedback if not a metric."""

        comment: Optional[str] = None
        """Comment or explanation for the feedback."""

    @router.post("/feedback")
    def create_run_feedback(feedback_create_req: FeedbackCreateRequest) -> dict:
        """
        Send feedback on an individual run to LangSmith.

        Note that a successful response means that feedback was successfully
        submitted. It does not guarantee that the feedback is recorded by
        LangSmith. Requests may be silently rejected by the server if they
        are unauthenticated or invalid.
        """

        langsmith_client.create_feedback(
            feedback_create_req.run_id,
            feedback_create_req.key,
            score=feedback_create_req.score,
            value=feedback_create_req.value,
            comment=feedback_create_req.comment,
            source_info={
                "from_langserve": True,
            },
        )

        return {"status": "ok"}


================================================
FILE: backend/app/api/threads.py
================================================
from typing import Annotated, Any, Dict, List, Optional, Sequence, Union
from uuid import uuid4

from fastapi import APIRouter, HTTPException, Path
from langchain.schema.messages import AnyMessage
from pydantic import BaseModel, Field

import app.storage as storage
from app.auth.handlers import AuthedUser
from app.schema import Thread

router = APIRouter()


ThreadID = Annotated[str, Path(description="The ID of the thread.")]


class ThreadPutRequest(BaseModel):
    """Payload for creating or updating a thread."""

    name: Annotated[str, Field(description="The name of the thread.")]
    assistant_id: Annotated[str, Field(description="The ID of the assistant to use.")]


class ThreadPostRequest(BaseModel):
    """Payload for adding state to a thread."""

    values: Union[Sequence[AnyMessage], Dict[str, Any]]
    config: Optional[Dict[str, Any]] = None


@router.get("/")
async def list_threads(user: AuthedUser) -> List[Thread]:
    """List all threads for the current user."""
    return await storage.list_threads(user.user_id)


@router.get("/{tid}/state")
async def get_thread_state(
    user: AuthedUser,
    tid: ThreadID,
):
    """Get state for a thread."""
    thread = await storage.get_thread(user.user_id, tid)
    if not thread:
        raise HTTPException(status_code=404, detail="Thread not found")
    assistant = await storage.get_assistant(user.user_id, thread.assistant_id)
    if not assistant:
        raise HTTPException(status_code=400, detail="Thread has no assistant")
    return await storage.get_thread_state(
        user_id=user.user_id,
        thread_id=tid,
        assistant=assistant,
    )


@router.post("/{tid}/state")
async def add_thread_state(
    user: AuthedUser,
    tid: ThreadID,
    payload: ThreadPostRequest,
):
    """Add state to a thread."""
    thread = await storage.get_thread(user.user_id, tid)
    if not thread:
        raise HTTPException(status_code=404, detail="Thread not found")
    assistant = await storage.get_assistant(user.user_id, thread.assistant_id)
    if not assistant:
        raise HTTPException(status_code=400, detail="Thread has no assistant")
    return await storage.update_thread_state(
        payload.config or {"configurable": {"thread_id": tid}},
        payload.values,
        user_id=user.user_id,
        assistant=assistant,
    )


@router.get("/{tid}/history")
async def get_thread_history(
    user: AuthedUser,
    tid: ThreadID,
):
    """Get all past states for a thread."""
    thread = await storage.get_thread(user.user_id, tid)
    if not thread:
        raise HTTPException(status_code=404, detail="Thread not found")
    assistant = await storage.get_assistant(user.user_id, thread.assistant_id)
    if not assistant:
        raise HTTPException(status_code=400, detail="Thread has no assistant")
    return await storage.get_thread_history(
        user_id=user.user_id,
        thread_id=tid,
        assistant=assistant,
    )


@router.get("/{tid}")
async def get_thread(
    user: AuthedUser,
    tid: ThreadID,
) -> Thread:
    """Get a thread by ID."""
    thread = await storage.get_thread(user.user_id, tid)
    if not thread:
        raise HTTPException(status_code=404, detail="Thread not found")
    return thread


@router.post("")
async def create_thread(
    user: AuthedUser,
    thread_put_request: ThreadPutRequest,
) -> Thread:
    """Create a thread."""
    return await storage.put_thread(
        user.user_id,
        str(uuid4()),
        assistant_id=thread_put_request.assistant_id,
        name=thread_put_request.name,
    )


@router.put("/{tid}")
async def upsert_thread(
    user: AuthedUser,
    tid: ThreadID,
    thread_put_request: ThreadPutRequest,
) -> Thread:
    """Create or update a thread."""
    return await storage.put_thread(
        user.user_id,
        tid,
        assistant_id=thread_put_request.assistant_id,
        name=thread_put_request.name,
    )


@router.delete("/{tid}")
async def delete_thread(
    user: AuthedUser,
    tid: ThreadID,
):
    """Delete a thread by ID."""
    await storage.delete_thread(user.user_id, tid)
    return {"status": "ok"}


================================================
FILE: backend/app/auth/__init__.py
================================================


================================================
FILE: backend/app/auth/handlers.py
================================================
from abc import ABC, abstractmethod
from functools import lru_cache
from typing import Annotated

import jwt
import requests
from fastapi import Depends, HTTPException, Request
from fastapi.security.http import HTTPBearer

import app.storage as storage
from app.auth.settings import AuthType, settings
from app.schema import User


class AuthHandler(ABC):
    @abstractmethod
    async def __call__(self, request: Request) -> User:
        """Auth handler that returns a user object or raises an HTTPException."""


class NOOPAuth(AuthHandler):
    _default_sub = "static-default-user-id"

    async def __call__(self, request: Request) -> User:
        sub = request.cookies.get("opengpts_user_id") or self._default_sub
        user, _ = await storage.get_or_create_user(sub)
        return user


class JWTAuthBase(AuthHandler):
    async def __call__(self, request: Request) -> User:
        http_bearer = await HTTPBearer()(request)
        token = http_bearer.credentials

        try:
            payload = self.decode_token(token, self.get_decode_key(token))
        except jwt.PyJWTError as e:
            raise HTTPException(status_code=401, detail=str(e))

        user, _ = await storage.get_or_create_user(payload["sub"])
        return user

    @abstractmethod
    def decode_token(self, token: str, decode_key: str) -> dict:
        ...

    @abstractmethod
    def get_decode_key(self, token: str) -> str:
        ...


class JWTAuthLocal(JWTAuthBase):
    """Auth handler that uses a hardcoded decode key from env."""

    def decode_token(self, token: str, decode_key: str) -> dict:
        return jwt.decode(
            token,
            decode_key,
            issuer=settings.jwt_local.iss,
            audience=settings.jwt_local.aud,
            algorithms=[settings.jwt_local.alg.upper()],
            options={"require": ["exp", "iss", "aud", "sub"]},
        )

    def get_decode_key(self, token: str) -> str:
        return settings.jwt_local.decode_key


class JWTAuthOIDC(JWTAuthBase):
    """Auth handler that uses OIDC discovery to get the decode key."""

    def decode_token(self, token: str, decode_key: str) -> dict:
        alg = self._decode_complete_unverified(token)["header"]["alg"]
        return jwt.decode(
            token,
            decode_key,
            issuer=settings.jwt_oidc.iss,
            audience=settings.jwt_oidc.aud,
            algorithms=[alg.upper()],
            options={"require": ["exp", "iss", "aud", "sub"]},
        )

    def get_decode_key(self, token: str) -> str:
        unverified = self._decode_complete_unverified(token)
        issuer = unverified["payload"].get("iss")
        kid = unverified["header"].get("kid")
        return self._get_jwk_client(issuer).get_signing_key(kid).key

    @lru_cache
    def _decode_complete_unverified(self, token: str) -> dict:
        return jwt.api_jwt.decode_complete(token, options={"verify_signature": False})

    @lru_cache
    def _get_jwk_client(self, issuer: str) -> jwt.PyJWKClient:
        """
        lru_cache ensures a single instance of PyJWKClient per issuer. This is
        so that we can take advantage of jwks caching (and invalidation) handled
        by PyJWKClient.
        """
        url = issuer.rstrip("/") + "/.well-known/openid-configuration"
        config = requests.get(url).json()
        return jwt.PyJWKClient(config["jwks_uri"], cache_jwk_set=True)


@lru_cache(maxsize=1)
def get_auth_handler() -> AuthHandler:
    if settings.auth_type == AuthType.JWT_LOCAL:
        return JWTAuthLocal()
    elif settings.auth_type == AuthType.JWT_OIDC:
        return JWTAuthOIDC()
    return NOOPAuth()


async def auth_user(
    request: Request, auth_handler: AuthHandler = Depends(get_auth_handler)
):
    return await auth_handler(request)


AuthedUser = Annotated[User, Depends(auth_user)]


================================================
FILE: backend/app/auth/settings.py
================================================
import os
from base64 import b64decode
from enum import Enum
from typing import List, Optional, Union

from pydantic import ConfigDict, field_validator, model_validator
from pydantic_settings import BaseSettings


class AuthType(Enum):
    NOOP = "noop"
    JWT_LOCAL = "jwt_local"
    JWT_OIDC = "jwt_oidc"


class JWTSettingsBase(BaseSettings):
    iss: str
    aud: Union[str, list[str]]

    @field_validator("aud", mode="before")
    @classmethod
    def set_aud(cls, v) -> Union[str, List[str]]:
        if isinstance(v, str) and "," in v:
            return v.split(",")
        return v

    model_config = ConfigDict(
        env_prefix="jwt_",
    )


class JWTSettingsLocal(JWTSettingsBase):
    decode_key_b64: str
    decode_key: str = None
    alg: str

    @field_validator("decode_key", mode="before")
    @classmethod
    def set_decode_key(cls, v, info):
        """
        Key may be a multiline string (e.g. in the case of a public key), so to
        be able to set it from env, we set it as a base64 encoded string and
        decode it here.
        """
        decode_key_b64 = info.data.get("decode_key_b64")
        if decode_key_b64:
            return b64decode(decode_key_b64).decode("utf-8")
        return v


class JWTSettingsOIDC(JWTSettingsBase):
    ...


class Settings(BaseSettings):
    auth_type: AuthType
    jwt_local: Optional[JWTSettingsLocal] = None
    jwt_oidc: Optional[JWTSettingsOIDC] = None

    @model_validator(mode="before")
    @classmethod
    def check_jwt_settings(cls, values):
        auth_type = values.get("auth_type")
        if auth_type == AuthType.JWT_LOCAL and values.get("jwt_local") is None:
            raise ValueError(
                "jwt local settings must be set when auth type is jwt_local."
            )
        if auth_type == AuthType.JWT_OIDC and values.get("jwt_oidc") is None:
            raise ValueError(
                "jwt oidc settings must be set when auth type is jwt_oidc."
            )
        return values


auth_type = AuthType(os.getenv("AUTH_TYPE", AuthType.NOOP.value).lower())
kwargs = {"auth_type": auth_type}
if auth_type == AuthType.JWT_LOCAL:
    kwargs["jwt_local"] = JWTSettingsLocal()
elif auth_type == AuthType.JWT_OIDC:
    kwargs["jwt_oidc"] = JWTSettingsOIDC()
settings = Settings(**kwargs)


================================================
FILE: backend/app/chatbot.py
================================================
from typing import Annotated, List

from langchain_core.language_models.base import LanguageModelLike
from langchain_core.messages import BaseMessage, SystemMessage
from langgraph.checkpoint.base import BaseCheckpointSaver
from langgraph.graph.state import StateGraph

from app.message_types import add_messages_liberal


def get_chatbot_executor(
    llm: LanguageModelLike,
    system_message: str,
    checkpoint: BaseCheckpointSaver,
):
    def _get_messages(messages):
        return [SystemMessage(content=system_message)] + messages

    chatbot = _get_messages | llm

    workflow = StateGraph(Annotated[List[BaseMessage], add_messages_liberal])
    workflow.add_node("chatbot", chatbot)
    workflow.set_entry_point("chatbot")
    workflow.set_finish_point("chatbot")
    app = workflow.compile(checkpointer=checkpoint)
    return app


================================================
FILE: backend/app/checkpoint.py
================================================
import os
from typing import Any, AsyncIterator, Optional, Sequence

import structlog
from langgraph.checkpoint.base import (
    ChannelVersions,
    Checkpoint,
    CheckpointMetadata,
    CheckpointTuple,
    RunnableConfig,
)
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
from langgraph.checkpoint.postgres.base import BasePostgresSaver
from langgraph.checkpoint.serde.base import SerializerProtocol
from psycopg import AsyncPipeline
from psycopg_pool import AsyncConnectionPool

logger = structlog.get_logger(__name__)


class AsyncPostgresCheckpoint(BasePostgresSaver):
    """A singleton implementation of AsyncPostgresSaver with separate setup."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(
        self,
        pipe: Optional[AsyncPipeline] = None,
        serde: Optional[SerializerProtocol] = None,
    ) -> None:
        if not hasattr(self, "_initialized"):
            super().__init__(serde=serde)
            # Initialize basic attributes
            self.pipe = pipe
            self.serde = serde
            self._initialized = True
            self._setup_complete = False
            self.async_postgres_saver = None

    async def ensure_setup(self) -> None:
        """Ensure the instance is set up before use."""
        if not self._setup_complete:
            await self.setup()
            self._setup_complete = True

    async def setup(self) -> None:
        """Open the connection pool and create the checkpoint tables."""
        try:
            conninfo = (
                f"postgresql://{os.environ['POSTGRES_USER']}:"
                f"{os.environ['POSTGRES_PASSWORD']}@"
                f"{os.environ['POSTGRES_HOST']}:"
                f"{os.environ['POSTGRES_PORT']}/"
                f"{os.environ['POSTGRES_DB']}"
            )

            pool = AsyncConnectionPool(
                conninfo=conninfo,
                kwargs={"autocommit": True, "prepare_threshold": 0},
                open=False,  # Don't open in constructor
            )
            await pool.open()

            self.async_postgres_saver = AsyncPostgresSaver(
                conn=pool, pipe=self.pipe, serde=self.serde
            )

            # Setup will create/migrate the tables if they don't exist
            await self.async_postgres_saver.setup()

            logger.warning("Checkpoint setup complete.")
        except Exception as e:
            logger.error(f"Failed to set up AsyncPostgresCheckpoint: {e}")
            raise

    async def alist(
        self,
        config: Optional[RunnableConfig],
        *,
        filter: Optional[dict[str, Any]] = None,
        before: Optional[RunnableConfig] = None,
        limit: Optional[int] = None,
    ) -> AsyncIterator[CheckpointTuple]:
        """List checkpoints from the database asynchronously."""
        async for checkpoint in self.async_postgres_saver.alist(
            config, filter=filter, before=before, limit=limit
        ):
            yield checkpoint

    async def aget_tuple(self, config: RunnableConfig) -> Optional[CheckpointTuple]:
        """Get a checkpoint tuple from the database asynchronously."""
        return await self.async_postgres_saver.aget_tuple(config)

    async def aput(
        self,
        config: RunnableConfig,
        checkpoint: Checkpoint,
        metadata: CheckpointMetadata,
        new_versions: ChannelVersions,
    ) -> RunnableConfig:
        """Save a checkpoint to the database asynchronously."""
        return await self.async_postgres_saver.aput(
            config, checkpoint, metadata, new_versions
        )

    async def aput_writes(
        self,
        config: RunnableConfig,
        writes: Sequence[tuple[str, Any]],
        task_id: str,
    ) -> None:
        """Store intermediate writes linked to a checkpoint asynchronously."""
        await self.async_postgres_saver.aput_writes(config, writes, task_id)


================================================
FILE: backend/app/ingest.py
================================================
"""Code to ingest blob into a vectorstore.

Code is responsible for taking binary data, parsing it and then indexing it
into a vector store.

This code should be agnostic to how the blob got generated; i.e., it does not
know about server/uploading etc.
"""
from typing import List

from langchain.text_splitter import TextSplitter
from langchain_community.document_loaders import Blob
from langchain_community.document_loaders.base import BaseBlobParser
from langchain_core.documents import Document
from langchain_core.vectorstores import VectorStore


def _update_document_metadata(document: Document, namespace: str) -> None:
    """Add the namespace to the document metadata in place."""
    document.metadata["namespace"] = namespace


def _sanitize_document_content(document: Document) -> None:
    """Replace NUL characters in the document content in place."""
    # Without this, PDF ingestion fails with
    # "A string literal cannot contain NUL (0x00) characters".
    document.page_content = document.page_content.replace("\x00", "x")


# PUBLIC API


def ingest_blob(
    blob: Blob,
    parser: BaseBlobParser,
    text_splitter: TextSplitter,
    vectorstore: VectorStore,
    namespace: str,
    *,
    batch_size: int = 100,
) -> List[str]:
    """Ingest a document into the vectorstore."""
    docs_to_index = []
    ids = []
    for document in parser.lazy_parse(blob):
        docs = text_splitter.split_documents([document])
        for doc in docs:
            _sanitize_document_content(doc)
            _update_document_metadata(doc, namespace)
        docs_to_index.extend(docs)

        if len(docs_to_index) >= batch_size:
            ids.extend(vectorstore.add_documents(docs_to_index))
            docs_to_index = []

    if docs_to_index:
        ids.extend(vectorstore.add_documents(docs_to_index))

    return ids


================================================
FILE: backend/app/lifespan.py
================================================
import os
from contextlib import asynccontextmanager

import asyncpg
import orjson
import structlog
from fastapi import FastAPI

from app.checkpoint import AsyncPostgresCheckpoint

_pg_pool = None


def get_pg_pool() -> asyncpg.pool.Pool:
    return _pg_pool


async def _init_connection(conn) -> None:
    await conn.set_type_codec(
        "json",
        encoder=lambda v: orjson.dumps(v).decode(),
        decoder=orjson.loads,
        schema="pg_catalog",
    )
    await conn.set_type_codec(
        "jsonb",
        encoder=lambda v: orjson.dumps(v).decode(),
        decoder=orjson.loads,
        schema="pg_catalog",
    )
    await conn.set_type_codec(
        "uuid", encoder=lambda v: str(v), decoder=lambda v: v, schema="pg_catalog"
    )


@asynccontextmanager
async def lifespan(app: FastAPI):
    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.PositionalArgumentsFormatter(),
            structlog.processors.StackInfoRenderer(),
            structlog.processors.UnicodeDecoder(),
            structlog.stdlib.render_to_log_kwargs,
        ],
        logger_factory=structlog.stdlib.LoggerFactory(),
        wrapper_class=structlog.stdlib.BoundLogger,
        cache_logger_on_first_use=True,
    )

    global _pg_pool

    _pg_pool = await asyncpg.create_pool(
        database=os.environ["POSTGRES_DB"],
        user=os.environ["POSTGRES_USER"],
        password=os.environ["POSTGRES_PASSWORD"],
        host=os.environ["POSTGRES_HOST"],
        port=os.environ["POSTGRES_PORT"],
        init=_init_connection,
    )
    await AsyncPostgresCheckpoint().ensure_setup()
    yield
    await _pg_pool.close()
    _pg_pool = None


================================================
FILE: backend/app/llms.py
================================================
import os
from functools import lru_cache
from urllib.parse import urlparse

import boto3
import httpx
import structlog
from langchain_anthropic import ChatAnthropic
from langchain_community.chat_models import BedrockChat, ChatFireworks
from langchain_community.chat_models.ollama import ChatOllama
from langchain_google_vertexai import ChatVertexAI
from langchain_openai import AzureChatOpenAI, ChatOpenAI

logger = structlog.get_logger(__name__)


@lru_cache(maxsize=4)
def get_openai_llm(model: str = "gpt-3.5-turbo", azure: bool = False):
    proxy_url = os.getenv("PROXY_URL")
    http_client = None
    if proxy_url:
        parsed_url = urlparse(proxy_url)
        if parsed_url.scheme and parsed_url.netloc:
            http_client = httpx.AsyncClient(proxies=proxy_url)
        else:
            logger.warning("Invalid proxy URL provided. Proceeding without proxy.")

    if not azure:
        try:
            openai_model = model
            llm = ChatOpenAI(
                http_client=http_client,
                model=openai_model,
                temperature=0,
            )
        except Exception as e:
            logger.error(
                f"Failed to instantiate ChatOpenAI due to: {str(e)}. Falling back to AzureChatOpenAI."
            )
            llm = AzureChatOpenAI(
                http_client=http_client,
                temperature=0,
                deployment_name=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
                azure_endpoint=os.environ["AZURE_OPENAI_API_BASE"],
                openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                openai_api_key=os.environ["AZURE_OPENAI_API_KEY"],
            )
    else:
        llm = AzureChatOpenAI(
            http_client=http_client,
            temperature=0,
            deployment_name=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
            azure_endpoint=os.environ["AZURE_OPENAI_API_BASE"],
            openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
            openai_api_key=os.environ["AZURE_OPENAI_API_KEY"],
        )
    return llm


@lru_cache(maxsize=2)
def get_anthropic_llm(bedrock: bool = False):
    if bedrock:
        client = boto3.client(
            "bedrock-runtime",
            region_name="us-west-2",
            aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID"),
            aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY"),
        )
        model = BedrockChat(model_id="anthropic.claude-v2", client=client)
    else:
        model = ChatAnthropic(
            model_name="claude-3-haiku-20240307",
            max_tokens_to_sample=2000,
            temperature=0,
        )
    return model


@lru_cache(maxsize=1)
def get_google_llm():
    return ChatVertexAI(
        model_name="gemini-pro", convert_system_message_to_human=True, streaming=True
    )


@lru_cache(maxsize=1)
def get_mixtral_fireworks():
    return ChatFireworks(model="accounts/fireworks/models/mixtral-8x7b-instruct")


@lru_cache(maxsize=1)
def get_ollama_llm():
    # Fall back to local defaults when the variables are unset or empty.
    model_name = os.environ.get("OLLAMA_MODEL") or "llama2"
    ollama_base_url = os.environ.get("OLLAMA_BASE_URL") or "http://localhost:11434"

    return ChatOllama(model=model_name, base_url=ollama_base_url)


================================================
FILE: backend/app/message_types.py
================================================
from typing import Any

from langchain_core.messages import (
    FunctionMessage,
    MessageLikeRepresentation,
    ToolMessage,
    _message_from_dict,
)
from langgraph.graph.message import Messages, add_messages
from pydantic import Field


class LiberalFunctionMessage(FunctionMessage):
    content: Any = Field(default="")


class LiberalToolMessage(ToolMessage):
    content: Any = Field(default="")


def _convert_pydantic_dict_to_message(
    data: MessageLikeRepresentation,
) -> MessageLikeRepresentation:
    """Convert a dictionary to a message object if it matches message format."""
    if (
        isinstance(data, dict)
        and "content" in data
        and isinstance(data.get("type"), str)
    ):
        # Copy first so the caller's dict is not mutated by pop().
        data = dict(data)
        _type = data.pop("type")
        return _message_from_dict({"data": data, "type": _type})
    return data


def add_messages_liberal(left: Messages, right: Messages):
    # coerce to list
    if not isinstance(left, list):
        left = [left]
    if not isinstance(right, list):
        right = [right]
    return add_messages(
        [_convert_pydantic_dict_to_message(m) for m in left],
        [_convert_pydantic_dict_to_message(m) for m in right],
    )


================================================
FILE: backend/app/parsing.py
================================================
"""Module contains logic for parsing binary blobs into text."""
from langchain_community.document_loaders.parsers import BS4HTMLParser, PDFMinerParser
from langchain_community.document_loaders.parsers.generic import MimeTypeBasedParser
from langchain_community.document_loaders.parsers.msword import MsWordParser
from langchain_community.document_loaders.parsers.txt import TextParser

HANDLERS = {
    "application/pdf": PDFMinerParser(),
    "text/plain": TextParser(),
    "text/html": BS4HTMLParser(),
    "application/msword": MsWordParser(),
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": (
        MsWordParser()
    ),
}

SUPPORTED_MIMETYPES = sorted(HANDLERS.keys())

# PUBLIC API

MIMETYPE_BASED_PARSER = MimeTypeBasedParser(
    handlers=HANDLERS,
    fallback_parser=None,
)


================================================
FILE: backend/app/retrieval.py
================================================
import operator
from typing import Annotated, List, Sequence, TypedDict
from uuid import uuid4

from langchain_core.language_models.base import LanguageModelLike
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, SystemMessage
from langchain_core.prompts import PromptTemplate
from langchain_core.retrievers import BaseRetriever
from langchain_core.runnables import chain
from langgraph.checkpoint.base import BaseCheckpointSaver
from langgraph.graph import END
from langgraph.graph.state import StateGraph

from app.message_types import LiberalToolMessage, add_messages_liberal

search_prompt = PromptTemplate.from_template(
    """Given the conversation below, come up with a search query to look up.

This search query can be either a few words or a question.

Return ONLY this search query, nothing more.

>>> Conversation:
{conversation}
>>> END OF CONVERSATION

Remember, return ONLY the search query that will help you when formulating a response to the above conversation."""
)


response_prompt_template = """{instructions}

Respond to the user using ONLY the context provided below. Do not make anything up.

{context}"""


def get_retrieval_executor(
    llm: LanguageModelLike,
    retriever: BaseRetriever,
    system_message: str,
    checkpoint: BaseCheckpointSaver,
):
    class AgentState(TypedDict):
        messages: Annotated[List[BaseMessage], add_messages_liberal]
        msg_count: Annotated[int, operator.add]

    def _get_messages(messages):
        chat_history = []
        for m in messages:
            if isinstance(m, AIMessage):
                if not m.tool_calls:
                    chat_history.append(m)
            if isinstance(m, HumanMessage):
                chat_history.append(m)
        response = messages[-1].content
        content = "\n".join([d["page_content"] for d in response])
        return [
            SystemMessage(
                content=response_prompt_template.format(
                    instructions=system_message, context=content
                )
            )
        ] + chat_history

    @chain
    async def get_search_query(messages: Sequence[BaseMessage]):
        convo = []
        for m in messages:
            if isinstance(m, AIMessage):
                # Skip retrieval tool-call messages (current and legacy formats).
                if not m.tool_calls and "function_call" not in m.additional_kwargs:
                    convo.append(f"AI: {m.content}")
            if isinstance(m, HumanMessage):
                convo.append(f"Human: {m.content}")
        conversation = "\n".join(convo)
        prompt = await search_prompt.ainvoke({"conversation": conversation})
        response = await llm.ainvoke(prompt, {"tags": ["nostream"]})
        return response

    async def invoke_retrieval(state: AgentState):
        messages = state["messages"]
        if len(messages) == 1:
            human_input = messages[-1].content
            return {
                "messages": [
                    AIMessage(
                        content="",
                        tool_calls=[
                            {
                                "id": uuid4().hex,
                                "name": "retrieval",
                                "args": {"query": human_input},
                            }
                        ],
                    )
                ]
            }
        else:
            search_query = await get_search_query.ainvoke(messages)
            return {
                "messages": [
                    AIMessage(
                        id=search_query.id,
                        content="",
                        tool_calls=[
                            {
                                "id": uuid4().hex,
                                "name": "retrieval",
                                "args": {"query": search_query.content},
                            }
                        ],
                    )
                ]
            }

    async def retrieve(state: AgentState):
        messages = state["messages"]
        params = messages[-1].tool_calls[0]
        query = params["args"]["query"]
        response = await retriever.ainvoke(query)
        response = [doc.model_dump() for doc in response]
        msg = LiberalToolMessage(
            name="retrieval", content=response, tool_call_id=params["id"]
        )
        return {"messages": [msg], "msg_count": 1}

    def call_model(state: AgentState):
        messages = state["messages"]
        response = llm.invoke(_get_messages(messages))
        return {"messages": [response], "msg_count": 1}

    workflow = StateGraph(AgentState)
    workflow.add_node("invoke_retrieval", invoke_retrieval)
    workflow.add_node("retrieve", retrieve)
    workflow.add_node("response", call_model)
    workflow.set_entry_point("invoke_retrieval")
    workflow.add_edge("invoke_retrieval", "retrieve")
    workflow.add_edge("retrieve", "response")
    workflow.add_edge("response", END)
    app = workflow.compile(checkpointer=checkpoint)
    return app


================================================
FILE: backend/app/schema.py
================================================
from datetime import datetime
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    user_id: str
    """The ID of the user."""
    sub: str
    """The sub of the user (from a JWT token)."""
    created_at: datetime
    """The time the user was created."""


class Assistant(BaseModel):
    assistant_id: str
    """The ID of the assistant."""
    user_id: str
    """The ID of the user that owns the assistant."""
    name: str
    """The name of the assistant."""
    config: dict
    """The assistant config."""
    updated_at: datetime
    """The last time the assistant was updated."""
    public: bool = False
    """Whether the assistant is public."""


class Thread(BaseModel):
    thread_id: str
    """The ID of the thread."""
    user_id: str
    """The ID of the user that owns the thread."""
    assistant_id: Optional[str] = None
    """The assistant that was used in conjunction with this thread."""
    name: str
    """The name of the thread."""
    updated_at: datetime
    """The last time the thread was updated."""
    metadata: Optional[dict] = None
    """Additional thread metadata, such as the assistant type."""


================================================
FILE: backend/app/server.py
================================================
import os
from pathlib import Path

import orjson
import structlog
from fastapi import FastAPI, Form, UploadFile
from fastapi.exceptions import HTTPException
from fastapi.staticfiles import StaticFiles

import app.storage as storage
from app.api import router as api_router
from app.auth.handlers import AuthedUser
from app.lifespan import lifespan
from app.upload import convert_ingestion_input_to_blob, ingest_runnable

logger = structlog.get_logger(__name__)

app = FastAPI(title="OpenGPTs API", lifespan=lifespan)


# Get root of app, used to point to directory containing static files
ROOT = Path(__file__).parent.parent


app.include_router(api_router)


@app.post("/ingest", description="Upload files to the given assistant.")
async def ingest_files(
    files: list[UploadFile], user: AuthedUser, config: str = Form(...)
) -> None:
    """Ingest a list of files."""
    config = orjson.loads(config)

    assistant_id = config["configurable"].get("assistant_id")
    if assistant_id is not None:
        assistant = await storage.get_assistant(user.user_id, assistant_id)
        if assistant is None:
            raise HTTPException(status_code=404, detail="Assistant not found.")

    thread_id = config["configurable"].get("thread_id")
    if thread_id is not None:
        thread = await storage.get_thread(user.user_id, thread_id)
        if thread is None:
            raise HTTPException(status_code=404, detail="Thread not found.")

    file_blobs = [convert_ingestion_input_to_blob(file) for file in files]
    return ingest_runnable.batch(file_blobs, config)


@app.get("/health")
async def health() -> dict:
    return {"status": "ok"}


ui_dir = str(ROOT / "ui")

if os.path.exists(ui_dir):
    app.mount("", StaticFiles(directory=ui_dir, html=True), name="ui")
else:
    logger.warning("No UI directory found, serving API only.")

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8100)


================================================
FILE: backend/app/storage.py
================================================
from datetime import datetime, timezone
from typing import Any, List, Optional, Sequence, Union

from langchain_core.messages import AnyMessage
from langchain_core.runnables import RunnableConfig

from app.agent import agent
from app.lifespan import get_pg_pool
from app.schema import Assistant, Thread, User


async def list_assistants(user_id: str) -> List[Assistant]:
    """List all assistants for the current user."""
    async with get_pg_pool().acquire() as conn:
        records = await conn.fetch(
            "SELECT * FROM assistant WHERE user_id = $1", user_id
        )
        return [Assistant(**record) for record in records]


async def get_assistant(user_id: str, assistant_id: str) -> Optional[Assistant]:
    """Get an assistant by ID."""
    async with get_pg_pool().acquire() as conn:
        record = await conn.fetchrow(
            "SELECT * FROM assistant WHERE assistant_id = $1 AND (user_id = $2 OR public IS true)",
            assistant_id,
            user_id,
        )
        if record is None:
            return None
        return Assistant(**record)


async def list_public_assistants() -> List[Assistant]:
    """List all the public assistants."""
    async with get_pg_pool().acquire() as conn:
        records = await conn.fetch("SELECT * FROM assistant WHERE public IS true")
        return [Assistant(**record) for record in records]


async def put_assistant(
    user_id: str, assistant_id: str, *, name: str, config: dict, public: bool = False
) -> Assistant:
    """Create or update an assistant.

    Args:
        user_id: The user ID.
        assistant_id: The assistant ID.
        name: The assistant name.
        config: The assistant config.
        public: Whether the assistant is public.

    Returns:
        The assistant as written to the database.
    """
    updated_at = datetime.now(timezone.utc)
    async with get_pg_pool().acquire() as conn:
        async with conn.transaction():
            await conn.execute(
                (
                    "INSERT INTO assistant (assistant_id, user_id, name, config, updated_at, public) VALUES ($1, $2, $3, $4, $5, $6) "
                    "ON CONFLICT (assistant_id) DO UPDATE SET "
                    "user_id = EXCLUDED.user_id, "
                    "name = EXCLUDED.name, "
                    "config = EXCLUDED.config, "
                    "updated_at = EXCLUDED.updated_at, "
                    "public = EXCLUDED.public;"
                ),
                assistant_id,
                user_id,
                name,
                config,
                updated_at,
                public,
            )
    return Assistant(
        assistant_id=assistant_id,
        user_id=user_id,
        name=name,
        config=config,
        updated_at=updated_at,
        public=public,
    )


async def delete_assistant(user_id: str, assistant_id: str) -> None:
    """Delete an assistant by ID."""
    async with get_pg_pool().acquire() as conn:
        await conn.execute(
            "DELETE FROM assistant WHERE assistant_id = $1 AND user_id = $2",
            assistant_id,
            user_id,
        )


async def list_threads(user_id: str) -> List[Thread]:
    """List all threads for the current user."""
    async with get_pg_pool().acquire() as conn:
        records = await conn.fetch("SELECT * FROM thread WHERE user_id = $1", user_id)
        return [Thread(**record) for record in records]


async def get_thread(user_id: str, thread_id: str) -> Optional[Thread]:
    """Get a thread by ID."""
    async with get_pg_pool().acquire() as conn:
        record = await conn.fetchrow(
            "SELECT * FROM thread WHERE thread_id = $1 AND user_id = $2",
            thread_id,
            user_id,
        )
        if record is None:
            return None
        return Thread(**record)


async def get_thread_state(*, user_id: str, thread_id: str, assistant: Assistant):
    """Get state for a thread."""
    state = await agent.aget_state(
        {
            "configurable": {
                **assistant.config["configurable"],
                "thread_id": thread_id,
                "assistant_id": assistant.assistant_id,
            }
        }
    )
    # Keep original format - return values as is
    values = state.values if state.values else None

    return {
        "values": values,
        "next": state.next,
    }


async def update_thread_state(
    config: RunnableConfig,
    values: Union[Sequence[AnyMessage], dict[str, Any]],
    *,
    user_id: str,
    assistant: Assistant,
):
    """Add state to a thread."""
    # Get the current state to determine the format
    current_state = await agent.aget_state(
        {
            "configurable": {
                **assistant.config["configurable"],
                **config["configurable"],
                "assistant_id": assistant.assistant_id,
            }
        }
    )

    # If current state is a dict (retrieval agent), maintain dict structure
    if current_state.values and isinstance(current_state.values, dict):
        if isinstance(values, dict):
            state_values = values
        else:
            # Update just the messages in the existing state
            state_values = {**current_state.values, "messages": values}
    else:
        # For message-only states (tools_agent, chatbot), pass the values through
        # unchanged.
        state_values = values

    await agent.aupdate_state(
        {
            "configurable": {
                **assistant.config["configurable"],
                **config["configurable"],
                "assistant_id": assistant.assistant_id,
            }
        },
        state_values,
    )


async def get_thread_history(*, user_id: str, thread_id: str, assistant: Assistant):
    """Get the history of a thread."""
    return [
        {
            "values": c.values,
            "next": c.next,
            "config": c.config,
            "parent": c.parent_config,
        }
        async for c in agent.aget_state_history(
            {
                "configurable": {
                    **assistant.config["configurable"],
                    "thread_id": thread_id,
                    "assistant_id": assistant.assistant_id,
                }
            }
        )
    ]


def get_assistant_type(config: dict) -> str:
    """Extract the assistant type from config, falling back to "chatbot"."""
    configurable = config.get("configurable", {})

    # First try direct type key (old format)
    if "type" in configurable:
        return configurable["type"]

    # Default fallback
    return "chatbot"


async def put_thread(
    user_id: str, thread_id: str, *, assistant_id: str, name: str
) -> Thread:
    """Create or update a thread."""
    updated_at = datetime.now(timezone.utc)
    assistant = await get_assistant(user_id, assistant_id)
    metadata = (
        {"assistant_type": get_assistant_type(assistant.config)} if assistant else None
    )
    async with get_pg_pool().acquire() as conn:
        await conn.execute(
            (
                "INSERT INTO thread (thread_id, user_id, assistant_id, name, updated_at, metadata) VALUES ($1, $2, $3, $4, $5, $6) "
                "ON CONFLICT (thread_id) DO UPDATE SET "
                "user_id = EXCLUDED.user_id, "
                "assistant_id = EXCLUDED.assistant_id, "
                "name = EXCLUDED.name, "
                "updated_at = EXCLUDED.updated_at, "
                "metadata = EXCLUDED.metadata;"
            ),
            thread_id,
            user_id,
            assistant_id,
            name,
            updated_at,
            metadata,
        )
        return Thread(
            thread_id=thread_id,
            user_id=user_id,
            assistant_id=assistant_id,
            name=name,
            updated_at=updated_at,
            metadata=metadata,
        )


async def delete_thread(user_id: str, thread_id: str):
    """Delete a thread by ID."""
    async with get_pg_pool().acquire() as conn:
        await conn.execute(
            "DELETE FROM thread WHERE thread_id = $1 AND user_id = $2",
            thread_id,
            user_id,
        )


async def get_or_create_user(sub: str) -> tuple[User, bool]:
    """Returns a tuple of the user and a boolean indicating whether the user was created."""
    async with get_pg_pool().acquire() as conn:
        if record := await conn.fetchrow('SELECT * FROM "user" WHERE sub = $1', sub):
            return User(**record), False
        record = await conn.fetchrow(
            'INSERT INTO "user" (sub) VALUES ($1) RETURNING *', sub
        )
        return User(**record), True
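
Illustrative note, not part of the repository: `put_thread` above relies on an `INSERT ... ON CONFLICT ... DO UPDATE` upsert, where `EXCLUDED` refers to the row that failed to insert. The sketch below reproduces that pattern with the standard-library `sqlite3` module as a stand-in for PostgreSQL and asyncpg; the two-column `thread` table and the `UPSERT` constant are simplified assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL pool (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thread (thread_id TEXT PRIMARY KEY, name TEXT)")

# Same shape as the statement in put_thread: insert, or overwrite on key conflict.
UPSERT = (
    "INSERT INTO thread (thread_id, name) VALUES (?, ?) "
    "ON CONFLICT (thread_id) DO UPDATE SET name = excluded.name"
)

conn.execute(UPSERT, ("t1", "first"))    # creates the row
conn.execute(UPSERT, ("t1", "renamed"))  # updates the same row in place
rows = conn.execute("SELECT thread_id, name FROM thread").fetchall()
```

Running the statement twice with the same key leaves a single row carrying the latest values, which is why `put_thread` can serve both creation and modification.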


================================================
FILE: backend/app/stream.py
================================================
import functools
from typing import Any, AsyncIterator, Dict, Optional, Sequence, Union

import orjson
import structlog
from langchain_core.messages import AnyMessage, BaseMessage, message_chunk_to_message
from langchain_core.runnables import Runnable, RunnableConfig

logger = structlog.get_logger(__name__)

MessagesStream = AsyncIterator[Union[list[AnyMessage], str]]


async def astream_state(
    app: Runnable,
    input: Union[Sequence[AnyMessage], Dict[str, Any]],
    config: RunnableConfig,
) -> MessagesStream:
    """Stream messages from the runnable."""
    root_run_id: Optional[str] = None
    messages: dict[str, BaseMessage] = {}

    async for event in app.astream_events(
        input, config, version="v1", stream_mode="values", exclude_tags=["nostream"]
    ):
        if event["event"] == "on_chain_start" and not root_run_id:
            root_run_id = event["run_id"]
            yield root_run_id
        elif event["event"] == "on_chain_stream" and event["run_id"] == root_run_id:
            new_messages: list[BaseMessage] = []

            # event["data"]["chunk"] is a Sequence[AnyMessage] or a Dict[str, Any]
            state_chunk_msgs: Union[Sequence[AnyMessage], Dict[str, Any]] = event[
                "data"
            ]["chunk"]
            if isinstance(state_chunk_msgs, dict):
                state_chunk_msgs = event["data"]["chunk"]["messages"]

            for msg in state_chunk_msgs:
                msg_id = msg["id"] if isinstance(msg, dict) else msg.id
                if msg_id in messages and msg == messages[msg_id]:
                    continue
                else:
                    messages[msg_id] = msg
                    new_messages.append(msg)
            if new_messages:
                yield new_messages
        elif event["event"] == "on_chat_model_stream":
            message: BaseMessage = event["data"]["chunk"]
            if message.id not in messages:
                messages[message.id] = message
            else:
                messages[message.id] += message
            yield [messages[message.id]]


def _default(obj) -> Any:
    if hasattr(obj, "dict") and callable(obj.dict):
        return obj.dict()
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")


dumps = functools.partial(orjson.dumps, default=_default)


async def to_sse(messages_stream: MessagesStream) -> AsyncIterator[dict]:
    """Consume the stream into an EventSourceResponse"""
    try:
        async for chunk in messages_stream:
            # EventSourceResponse expects a string for data
            # so after serializing into bytes, we decode into utf-8
            # to get a string.
            if isinstance(chunk, str):
                yield {
                    "event": "metadata",
                    "data": orjson.dumps({"run_id": chunk}).decode(),
                }
            else:
                yield {
                    "event": "data",
                    "data": dumps(
                        [message_chunk_to_message(msg) for msg in chunk]
                    ).decode(),
                }
    except Exception:
        logger.warning("error in stream", exc_info=True)
        yield {
            "event": "error",
            # Do not expose the error message to the client since
            # the message may contain sensitive information.
            # We'll add client side errors for validation as well.
            "data": orjson.dumps(
                {"status_code": 500, "message": "Internal Server Error"}
            ).decode(),
        }

    # Send an end event to signal the end of the stream
    yield {"event": "end"}
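
Illustrative note, not part of the repository: `to_sse` above dispatches on the chunk type, emitting a `metadata` event for the run ID string, a `data` event for each message list, and a final `end` event. A minimal sketch of that framing follows, assuming a hypothetical `fake_stream` generator in place of `astream_state`, plain dicts in place of LangChain messages, and the standard-library `json` module in place of `orjson`.

```python
import asyncio
import json
from typing import AsyncIterator, Union


async def fake_stream() -> AsyncIterator[Union[str, list]]:
    """Hypothetical stand-in for astream_state: a run ID, then message lists."""
    yield "run-123"
    yield [{"id": "m1", "content": "Hel"}]
    yield [{"id": "m1", "content": "Hello"}]


async def frame(stream: AsyncIterator[Union[str, list]]) -> list:
    """Collect SSE-style event dicts using the same str-vs-list dispatch."""
    events = []
    async for chunk in stream:
        if isinstance(chunk, str):
            # A bare string is the root run ID.
            events.append({"event": "metadata", "data": json.dumps({"run_id": chunk})})
        else:
            # Anything else is a list of (possibly partial) messages.
            events.append({"event": "data", "data": json.dumps(chunk)})
    # Signal the end of the stream, as to_sse does.
    events.append({"event": "end"})
    return events


events = asyncio.run(frame(fake_stream()))
```

A client can rely on the event name alone to tell the run ID apart from message payloads, without inspecting the data field.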


================================================
FILE: backend/app/tools.py
================================================
from enum import Enum
from functools import lru_cache
from typing import Annotated, Literal

from langchain.tools.retriever import create_retriever_tool
from langchain_community.agent_toolkits.connery import ConneryToolkit
from langchain_community.retrievers.kay import KayAiRetriever
from langchain_community.retrievers.pubmed import PubMedRetriever
from langchain_community.retrievers.wikipedia import WikipediaRetriever
from langchain_community.retrievers.you import YouRetriever
from langchain_community.tools.arxiv.tool import ArxivQueryRun
from langchain_community.tools.connery import ConneryService
from langchain_community.tools.ddg_search.tool import DuckDuckGoSearchRun
from langchain_community.tools.tavily_search import (
    TavilyAnswer as _TavilyAnswer,
)
from langchain_community.tools.tavily_search import (
    TavilySearchResults,
)
from langchain_community.utilities.arxiv import ArxivAPIWrapper
from langchain_community.utilities.dalle_image_generator import DallEAPIWrapper
from langchain_community.utilities.tavily_search import TavilySearchAPIWrapper
from langchain_core.tools import Tool
from pydantic import BaseModel, Field
from typing_extensions import TypedDict

from app.upload import vstore


class DDGInput(BaseModel):
    query: Annotated[str, Field(description="search query to look up")]


class ArxivInput(BaseModel):
    query: Annotated[str, Field(description="search query to look up")]


class PythonREPLInput(BaseModel):
    query: Annotated[str, Field(description="python command to run")]


class DallEInput(BaseModel):
    query: Annotated[str, Field(description="image description to generate image from")]


class AvailableTools(str, Enum):
    ACTION_SERVER = "action_server_by_sema4ai"
    CONNERY = "ai_action_runner_by_connery"
    DDG_SEARCH = "ddg_search"
    TAVILY = "search_tavily"
    TAVILY_ANSWER = "search_tavily_answer"
    RETRIEVAL = "retrieval"
    ARXIV = "arxiv"
    YOU_SEARCH = "you_search"
    SEC_FILINGS = "sec_filings_kai_ai"
    PRESS_RELEASES = "press_releases_kai_ai"
    PUBMED = "pubmed"
    WIKIPEDIA = "wikipedia"
    DALL_E = "dall_e"


class ToolConfig(TypedDict):
    ...


class BaseTool(BaseModel):
    type: AvailableTools
    name: str
    description: str
    config: ToolConfig = Field(default_factory=dict)
    multi_use: bool = False


class ActionServerConfig(ToolConfig):
    url: str
    api_key: str


class ActionServer(BaseTool):
    type: Literal[AvailableTools.ACTION_SERVER] = AvailableTools.ACTION_SERVER
    name: Literal["Action Server by Sema4.ai"] = "Action Server by Sema4.ai"
    description: Literal[
        (
            "Run AI actions with "
            "[Sema4.ai Action Server](https://github.com/Sema4AI/actions)."
        )
    ] = (
        "Run AI actions with "
        "[Sema4.ai Action Server](https://github.com/Sema4AI/actions)."
    )
    config: ActionServerConfig
    multi_use: Literal[True] = True


class Connery(BaseTool):
    type: Literal[AvailableTools.CONNERY] = AvailableTools.CONNERY
    name: Literal["AI Action Runner by Connery"] = "AI Action Runner by Connery"
    description: Literal[
        (
            "Connect OpenGPTs to the real world with "
            "[Connery](https://github.com/connery-io/connery)."
        )
    ] = (
        "Connect OpenGPTs to the real world with "
        "[Connery](https://github.com/connery-io/connery)."
    )


class DDGSearch(BaseTool):
    type: Literal[AvailableTools.DDG_SEARCH] = AvailableTools.DDG_SEARCH
    name: Literal["DuckDuckGo Search"] = "DuckDuckGo Search"
    description: Literal[
        "Search the web with [DuckDuckGo](https://pypi.org/project/duckduckgo-search/)."
    ] = "Search the web with [DuckDuckGo](https://pypi.org/project/duckduckgo-search/)."


class Arxiv(BaseTool):
    type: Literal[AvailableTools.ARXIV] = AvailableTools.ARXIV
    name: Literal["Arxiv"] = "Arxiv"
    description: Literal[
        "Searches [Arxiv](https://arxiv.org/)."
    ] = "Searches [Arxiv](https://arxiv.org/)."


class YouSearch(BaseTool):
    type: Literal[AvailableTools.YOU_SEARCH] = AvailableTools.YOU_SEARCH
    name: Literal["You.com Search"] = "You.com Search"
    description: Literal[
        "Uses [You.com](https://you.com/) search, optimized responses for LLMs."
    ] = "Uses [You.com](https://you.com/) search, optimized responses for LLMs."


class SecFilings(BaseTool):
    type: Literal[AvailableTools.SEC_FILINGS] = AvailableTools.SEC_FILINGS
    name: Literal["SEC Filings (Kay.ai)"] = "SEC Filings (Kay.ai)"
    description: Literal[
        "Searches through SEC filings using [Kay.ai](https://www.kay.ai/)."
    ] = "Searches through SEC filings using [Kay.ai](https://www.kay.ai/)."


class PressReleases(BaseTool):
    type: Literal[AvailableTools.PRESS_RELEASES] = AvailableTools.PRESS_RELEASES
    name: Literal["Press Releases (Kay.ai)"] = "Press Releases (Kay.ai)"
    description: Literal[
        "Searches through press releases using [Kay.ai](https://www.kay.ai/)."
    ] = "Searches through press releases using [Kay.ai](https://www.kay.ai/)."


class PubMed(BaseTool):
    type: Literal[AvailableTools.PUBMED] = AvailableTools.PUBMED
    name: Literal["PubMed"] = "PubMed"
    description: Literal[
        "Searches [PubMed](https://pubmed.ncbi.nlm.nih.gov/)."
    ] = "Searches [PubMed](https://pubmed.ncbi.nlm.nih.gov/)."


class Wikipedia(BaseTool):
    type: Literal[AvailableTools.WIKIPEDIA] = AvailableTools.WIKIPEDIA
    name: Literal["Wikipedia"] = "Wikipedia"
    description: Literal[
        "Searches [Wikipedia](https://pypi.org/project/wikipedia/)."
    ] = "Searches [Wikipedia](https://pypi.org/project/wikipedia/)."


class Tavily(BaseTool):
    type: Literal[AvailableTools.TAVILY] = AvailableTools.TAVILY
    name: Literal["Search (Tavily)"] = "Search (Tavily)"
    description: Literal[
        (
            "Uses the [Tavily](https://app.tavily.com/) search engine. "
            "Includes sources in the response."
        )
    ] = (
        "Uses the [Tavily](https://app.tavily.com/) search engine. "
        "Includes sources in the response."
    )


class TavilyAnswer(BaseTool):
    type: Literal[AvailableTools.TAVILY_ANSWER] = AvailableTools.TAVILY_ANSWER
    name: Literal["Search (short answer, Tavily)"] = "Search (short answer, Tavily)"
    description: Literal[
        (
            "Uses the [Tavily](https://app.tavily.com/) search engine. "
            "This returns only the answer, no supporting evidence."
        )
    ] = (
        "Uses the [Tavily](https://app.tavily.com/) search engine. "
        "This returns only the answer, no supporting evidence."
    )


class Retrieval(BaseTool):
    type: Literal[AvailableTools.RETRIEVAL] = AvailableTools.RETRIEVAL
    name: Literal["Retrieval"] = "Retrieval"
    description: Literal[
        "Look up information in uploaded files."
    ] = "Look up information in uploaded files."


class DallE(BaseTool):
    type: Literal[AvailableTools.DALL_E] = AvailableTools.DALL_E
    name: Literal["Generate Image (Dall-E)"] = "Generate Image (Dall-E)"
    description: Literal[
        "Generates images from a text description using OpenAI's DALL-E model."
    ] = "Generates images from a text description using OpenAI's DALL-E model."


RETRIEVAL_DESCRIPTION = """Can be used to look up information that was uploaded to this assistant.
If the user is referencing particular files, that is often a good hint that information may be here.
If the user asks a vague question, they are likely meaning to look up info from this retriever, and you should call it!"""


def get_retriever(assistant_id: str, thread_id: str):
    return vstore.as_retriever(
        search_kwargs={"filter": {"namespace": {"$in": [assistant_id, thread_id]}}}
    )


@lru_cache(maxsize=5)
def get_retrieval_tool(assistant_id: str, thread_id: str, description: str):
    return create_retriever_tool(
        get_retriever(assistant_id, thread_id),
        "Retriever",
        description,
    )


@lru_cache(maxsize=1)
def _get_duck_duck_go():
    return DuckDuckGoSearchRun(args_schema=DDGInput)


@lru_cache(maxsize=1)
def _get_arxiv():
    return ArxivQueryRun(api_wrapper=ArxivAPIWrapper(), args_schema=ArxivInput)


@lru_cache(maxsize=1)
def _get_you_search():
    return create_retriever_tool(
        YouRetriever(n_hits=3, n_snippets_per_hit=3),
        "you_search",
        "Searches for documents using You.com",
    )


@lru_cache(maxsize=1)
def _get_sec_filings():
    return create_retriever_tool(
        KayAiRetriever.create(
            dataset_id="company", data_types=["10-K", "10-Q"], num_contexts=3
        ),
        "sec_filings_search",
        "Search for a query among SEC Filings",
    )


@lru_cache(maxsize=1)
def _get_press_releases():
    return create_retriever_tool(
        KayAiRetriever.create(
            dataset_id="company", data_types=["PressRelease"], num_contexts=6
        ),
        "press_release_search",
        "Search for a query among press releases from US companies",
    )


@lru_cache(maxsize=1)
def _get_pubmed():
    return create_retriever_tool(
        PubMedRetriever(), "pub_med_search", "Search for a query on PubMed"
    )


@lru_cache(maxsize=1)
def _get_wikipedia():
    return create_retriever_tool(
        WikipediaRetriever(), "wikipedia", "Search for a query on Wikipedia"
    )


@lru_cache(maxsize=1)
def _get_tavily():
    tavily_search = TavilySearchAPIWrapper()
    return TavilySearchResults(api_wrapper=tavily_search, name="search_tavily")


@lru_cache(maxsize=1)
def _get_tavily_answer():
    tavily_search = TavilySearchAPIWrapper()
    return _TavilyAnswer(api_wrapper=tavily_search, name="search_tavily_answer")


@lru_cache(maxsize=1)
def _get_connery_actions():
    connery_service = ConneryService()
    connery_toolkit = ConneryToolkit.create_instance(connery_service)
    tools = connery_toolkit.get_tools()
    return tools


@lru_cache(maxsize=1)
def _get_dalle_tools():
    return Tool(
        "Dall-E-Image-Generator",
        DallEAPIWrapper(size="1024x1024", quality="hd").run,
        "A wrapper around OpenAI DALL-E API. Useful for when you need to generate images from a text description. Input should be an image description.",
    )


TOOLS = {
    AvailableTools.CONNERY: _get_connery_actions,
    AvailableTools.DDG_SEARCH: _get_duck_duck_go,
    AvailableTools.ARXIV: _get_arxiv,
    AvailableTools.YOU_SEARCH: _get_you_search,
    AvailableTools.SEC_FILINGS: _get_sec_filings,
    AvailableTools.PRESS_RELEASES: _get_press_releases,
    AvailableTools.PUBMED: _get_pubmed,
    AvailableTools.TAVILY: _get_tavily,
    AvailableTools.WIKIPEDIA: _get_wikipedia,
    AvailableTools.TAVILY_ANSWER: _get_tavily_answer,
    AvailableTools.DALL_E: _get_dalle_tools,
}


================================================
FILE: backend/app/upload.py
================================================
"""API to deal with file uploads via a runnable.

For now this code assumes that the content arrives as a FastAPI UploadFile
whose raw bytes are read directly.

The details here might change in the future.

For the time being, upload and ingestion are coupled.
"""

from __future__ import annotations

import mimetypes
import os
from typing import BinaryIO, List, Optional

from fastapi import UploadFile
from langchain_community.vectorstores.pgvector import PGVector
from langchain_core.document_loaders.blob_loaders import Blob
from langchain_core.runnables import (
    ConfigurableField,
    RunnableConfig,
    RunnableSerializable,
)
from langchain_core.vectorstores import VectorStore
from langchain_openai import AzureOpenAIEmbeddings, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter, TextSplitter
from pydantic import ConfigDict

from app.ingest import ingest_blob
from app.parsing import MIMETYPE_BASED_PARSER


def _guess_mimetype(file_name: str, file_bytes: bytes) -> str:
    """Guess the mime-type of a file based on its name or bytes."""
    # Guess based on the file extension
    mime_type, _ = mimetypes.guess_type(file_name)

    # Return detected mime type from mimetypes guess, unless it's None
    if mime_type:
        return mime_type

    # Signature-based detection for common types
    if file_bytes.startswith(b"%PDF"):
        return "application/pdf"
    elif file_bytes.startswith(
        (b"\x50\x4B\x03\x04", b"\x50\x4B\x05\x06", b"\x50\x4B\x07\x08")
    ):
        return "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
    elif file_bytes.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "application/msword"
    elif file_bytes.startswith(b"\x09\x00\xff\x00\x06\x00"):
        return "application/vnd.ms-excel"

    # Check for CSV-like plain text content (commas, tabs, newlines)
    try:
        decoded = file_bytes[:1024].decode("utf-8", errors="ignore")
        if all(char in decoded for char in (",", "\n")) or all(
            char in decoded for char in ("\t", "\n")
        ):
            return "text/csv"
        elif decoded.isprintable() or decoded == "":
            return "text/plain"
    except UnicodeDecodeError:
        pass

    return "application/octet-stream"


def convert_ingestion_input_to_blob(file: UploadFile) -> Blob:
    """Convert ingestion input to blob."""
    file_data = file.file.read()
    file_name = file.filename

    # Check if file_name is a valid string
    if not isinstance(file_name, str):
        raise TypeError(f"Expected string for file name, got {type(file_name)}")

    mimetype = _guess_mimetype(file_name, file_data)
    return Blob.from_data(
        data=file_data,
        path=file_name,
        mime_type=mimetype,
    )


def _determine_azure_or_openai_embeddings() -> PGVector:
    if os.environ.get("OPENAI_API_KEY"):
        return PGVector(
            connection_string=PG_CONNECTION_STRING,
            embedding_function=OpenAIEmbeddings(),
            use_jsonb=True,
        )
    if os.environ.get("AZURE_OPENAI_API_KEY"):
        return PGVector(
            connection_string=PG_CONNECTION_STRING,
            embedding_function=AzureOpenAIEmbeddings(
                azure_endpoint=os.environ.get("AZURE_OPENAI_API_BASE"),
                azure_deployment=os.environ.get(
                    "AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME"
                ),
                openai_api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
            ),
            use_jsonb=True,
        )
    raise ValueError(
        "Either OPENAI_API_KEY or AZURE_OPENAI_API_KEY needs to be set for embeddings to work."
    )


class IngestRunnable(RunnableSerializable[BinaryIO, List[str]]):
    """Runnable for ingesting files into a vectorstore."""

    text_splitter: TextSplitter
    vectorstore: VectorStore
    assistant_id: Optional[str] = None
    thread_id: Optional[str] = None
    """Ingested documents will be associated with assistant_id or thread_id.
    
    ID is used as the namespace, and is filtered on at query time.
    """

    model_config = ConfigDict(arbitrary_types_allowed=True)

    @property
    def namespace(self) -> str:
        if (self.assistant_id is None and self.thread_id is None) or (
            self.assistant_id is not None and self.thread_id is not None
        ):
            raise ValueError(
                "Exactly one of assistant_id or thread_id must be provided"
            )
        return self.assistant_id if self.assistant_id is not None else self.thread_id

    def invoke(self, blob: Blob, config: Optional[RunnableConfig] = None) -> List[str]:
        out = ingest_blob(
            blob,
            MIMETYPE_BASED_PARSER,
            self.text_splitter,
            self.vectorstore,
            self.namespace,
        )
        return out


PG_CONNECTION_STRING = PGVector.connection_string_from_db_params(
    driver="psycopg2",
    host=os.environ["POSTGRES_HOST"],
    port=int(os.environ["POSTGRES_PORT"]),
    database=os.environ["POSTGRES_DB"],
    user=os.environ["POSTGRES_USER"],
    password=os.environ["POSTGRES_PASSWORD"],
)
vstore = _determine_azure_or_openai_embeddings()


ingest_runnable = IngestRunnable(
    text_splitter=RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200),
    vectorstore=vstore,
).configurable_fields(
    assistant_id=ConfigurableField(
        id="assistant_id",
        annotation=Optional[str],
        name="Assistant ID",
    ),
    thread_id=ConfigurableField(
        id="thread_id",
        annotation=Optional[str],
        name="Thread ID",
    ),
)
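
Illustrative note, not part of the repository: `_guess_mimetype` above first trusts the file extension and only then falls back to leading "magic bytes". A reduced sketch of that two-stage approach follows; `sniff_mimetype` is a hypothetical name, and unlike the original it reports ZIP containers generically as `application/zip` rather than assuming a spreadsheet.

```python
import mimetypes


def sniff_mimetype(file_name: str, data: bytes) -> str:
    """Guess a MIME type from the extension, then from leading bytes."""
    # Stage 1: extension-based guess from the standard library.
    guessed, _ = mimetypes.guess_type(file_name)
    if guessed:
        return guessed
    # Stage 2: signature-based detection for a couple of common formats.
    if data.startswith(b"%PDF"):
        return "application/pdf"
    if data.startswith(b"PK\x03\x04"):
        return "application/zip"
    # Unknown content falls back to a generic binary type.
    return "application/octet-stream"


by_extension = sniff_mimetype("report.pdf", b"")
by_signature = sniff_mimetype("upload", b"%PDF-1.7")
unknown = sniff_mimetype("blob", b"\x00\x01")
```

Checking the extension first keeps the common case cheap, while the signature check covers uploads that arrive without a usable file name.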


================================================
FILE: backend/log_config.json
================================================
{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
        "default": {
            "()": "uvicorn.logging.DefaultFormatter",
            "fmt": "%(asctime)s - %(name)s - %(levelprefix)s %(message)s"
        },
        "access": {
            "()": "uvicorn.logging.AccessFormatter",
            "fmt": "%(asctime)s - %(name)s - %(levelprefix)s  %(client_addr)s - \"%(request_line)s\" %(status_code)s"
        },
        "json": {
            "()": "pythonjsonlogger.jsonlogger.JsonFormatter",
            "fmt": "%(asctime)s - %(name)s - %(levelname)s %(message)s"
        }
    },
    "handlers": {
        "default": {
            "formatter": "default",
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stderr"
        },
        "access": {
            "formatter": "access",
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout"
        },
        "file": {
            "formatter": "json",
            "class": "logging.handlers.RotatingFileHandler",
            "filename": "./app.log",
            "mode": "a+",
            "maxBytes": 10000000,
            "backupCount": 1
        }
    },
    "root": {
        "handlers": [
            "default",
            "file"
        ],
        "level": "INFO"
    },
    "loggers": {
        "app": {
            "handlers": [
                "default",
                "file"
            ],
            "level": "INFO",
            "propagate": false
        },
        "uvicorn": {
            "handlers": [
                "default",
                "file"
            ],
            "level": "INFO",
            "propagate": false
        },
        "uvicorn.access": {
            "handlers": [
                "access",
                "file"
            ],
            "level": "INFO",
            "propagate": false
        }
    }
}

================================================
FILE: backend/migrations/000001_create_extensions_and_first_tables.down.sql
================================================
DROP TABLE IF EXISTS thread;
DROP TABLE IF EXISTS assistant;
DROP TABLE IF EXISTS checkpoints;


================================================
FILE: backend/migrations/000001_create_extensions_and_first_tables.up.sql
================================================
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE IF NOT EXISTS assistant (
    assistant_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id VARCHAR(255) NOT NULL,
    name VARCHAR(255) NOT NULL,
    config JSON NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),
    public BOOLEAN NOT NULL
);

CREATE TABLE IF NOT EXISTS thread (
    thread_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    assistant_id UUID REFERENCES assistant(assistant_id) ON DELETE SET NULL,
    user_id VARCHAR(255) NOT NULL,
    name VARCHAR(255) NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')
);

CREATE TABLE IF NOT EXISTS checkpoints (
    thread_id TEXT PRIMARY KEY,
    checkpoint BYTEA
);

================================================
FILE: backend/migrations/000002_checkpoints_update_schema.down.sql
================================================
ALTER TABLE checkpoints
    DROP CONSTRAINT IF EXISTS checkpoints_pkey,
    ADD PRIMARY KEY (thread_id),
    DROP COLUMN IF EXISTS thread_ts,
    DROP COLUMN IF EXISTS parent_ts;


================================================
FILE: backend/migrations/000002_checkpoints_update_schema.up.sql
================================================
ALTER TABLE checkpoints
    ADD COLUMN IF NOT EXISTS thread_ts TIMESTAMPTZ,
    ADD COLUMN IF NOT EXISTS parent_ts TIMESTAMPTZ;

UPDATE checkpoints
    SET thread_ts = CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
WHERE thread_ts IS NULL;

ALTER TABLE checkpoints
    DROP CONSTRAINT IF EXISTS checkpoints_pkey,
    ADD PRIMARY KEY (thread_id, thread_ts);


================================================
FILE: backend/migrations/000003_create_user.down.sql
================================================
ALTER TABLE assistant
    DROP CONSTRAINT fk_assistant_user_id,
    ALTER COLUMN user_id TYPE VARCHAR USING (user_id::text);

ALTER TABLE thread
    DROP CONSTRAINT fk_thread_user_id,
    ALTER COLUMN user_id TYPE VARCHAR USING (user_id::text);

DROP TABLE IF EXISTS "user";

================================================
FILE: backend/migrations/000003_create_user.up.sql
================================================
CREATE TABLE IF NOT EXISTS "user" (
    user_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    sub VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')
);

INSERT INTO "user" (user_id, sub)
SELECT DISTINCT user_id::uuid, user_id
FROM assistant
WHERE user_id IS NOT NULL
ON CONFLICT (user_id) DO NOTHING;

INSERT INTO "user" (user_id, sub)
SELECT DISTINCT user_id::uuid, user_id
FROM thread
WHERE user_id IS NOT NULL
ON CONFLICT (user_id) DO NOTHING;

ALTER TABLE assistant
    ALTER COLUMN user_id TYPE UUID USING (user_id::UUID),
    ADD CONSTRAINT fk_assistant_user_id FOREIGN KEY (user_id) REFERENCES "user"(user_id);

ALTER TABLE thread
    ALTER COLUMN user_id TYPE UUID USING (user_id::UUID),
    ADD CONSTRAINT fk_thread_user_id FOREIGN KEY (user_id) REFERENCES "user"(user_id);

================================================
FILE: backend/migrations/000004_add_metadata_to_thread.down.sql
================================================
ALTER TABLE thread
DROP COLUMN metadata;

================================================
FILE: backend/migrations/000004_add_metadata_to_thread.up.sql
================================================
ALTER TABLE thread
ADD COLUMN metadata JSONB;

UPDATE thread
SET metadata = json_build_object(
    'assistant_type', (SELECT config->'configurable'->>'type'
                 FROM assistant
                 WHERE assistant.assistant_id = thread.assistant_id)
);

================================================
FILE: backend/migrations/000005_advanced_checkpoints_schema.down.sql
================================================
-- Drop the blob storage table
DROP TABLE IF EXISTS checkpoint_blobs;

-- Drop the writes tracking table
DROP TABLE IF EXISTS checkpoint_writes;

-- Drop the new checkpoints table that was created by the application
DROP TABLE IF EXISTS checkpoints;

-- Restore the original checkpoints table by renaming old_checkpoints back
-- This preserves the original data that was saved before the migration
ALTER TABLE old_checkpoints RENAME TO checkpoints;

================================================
FILE: backend/migrations/000005_advanced_checkpoints_schema.up.sql
================================================
-- BREAKING CHANGE WARNING:
-- This migration represents a transition from pickle-based checkpointing to a new checkpoint system.
-- As a result, any threads created before this migration will not be usable/clickable in the UI.
-- Old thread data remains in the old_checkpoints table but cannot be accessed by the new version.

-- Rename existing checkpoints table to preserve current data
-- This is necessary because the application will create a new checkpoints table
-- with an updated schema during runtime initialization.
ALTER TABLE checkpoints RENAME TO old_checkpoints;

================================================
FILE: backend/pyproject.toml
================================================
[tool.poetry]
name = "opengpts"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]
readme = "README.md"
packages = [{include = "app"}]

[tool.poetry.dependencies]
python = "^3.9.0,<3.12"
sse-starlette = "^1.6.5"
tomli-w = "^1.0.0"
uvicorn = "^0.23.2"
fastapi = "^0.103.2"
# Uncomment if you need to work from a development branch
# This will only work for local development though!
# langchain = { git = "git@github.com:langchain-ai/langchain.git/", branch = "nc/subclass-runnable-binding" , subdirectory = "libs/langchain"}
orjson = "^3.9.10"
python-multipart = "^0.0.6"
tiktoken = "^0"
langchain = "^0.3"
langgraph = "0.2.45"
langgraph-checkpoint-postgres = "^2.0.2"
pydantic = "^2"
langchain-openai = "^0.2"
beautifulsoup4 = "^4.12.3"
boto3 = "^1.34.28"
duckduckgo-search = "^5.3.0"
arxiv = "^2.1.0"
kay = "^0.1.2"
xmltodict = "^0.13.0"
wikipedia = "^1.4.0"
langchain-google-vertexai = "^2.0"
langchain-google-community = "^2.0.1"
setuptools = "^69.0.3"
pdfminer-six = "^20231228"
fireworks-ai = "^0.11.2"
httpx = { version = "^0", extras = ["socks"] }
unstructured = {extras = ["doc", "docx"], version = "^0"}
pgvector = "^0.2.5"
psycopg2-binary = "^2.9.9"
asyncpg = "^0.29.0"
langchain-core = "^0.3"
pyjwt = {extras = ["crypto"], version = "^2.8.0"}
langchain-anthropic = "^0.2"
structlog = "^24.1.0"
python-json-logger = "^2.0.7"

[tool.poetry.group.dev.dependencies]
uvicorn = "^0.23.2"
pygithub = "^2.1.1"

[tool.poetry.group.lint.dependencies]
ruff = "^0.1.4"
codespell = "^2.2.0"

[tool.poetry.group.test.dependencies]
pytest = "^7.2.1"
pytest-cov = "^4.0.0"
pytest-asyncio = "^0.21.1"
pytest-mock = "^3.11.1"
pytest-socket = "^0.6.0"
pytest-watch = "^4.2.0"
pytest-timeout = "^2.2.0"

[tool.coverage.run]
omit = [
    "tests/*",
]

[tool.pytest.ini_options]
# --strict-markers will raise errors on unknown marks.
# https://docs.pytest.org/en/7.1.x/how-to/mark.html#raising-errors-on-unknown-marks
#
# https://docs.pytest.org/en/7.1.x/reference/reference.html
# --strict-config       any warnings encountered while parsing the `pytest`
#                       section of the configuration file raise errors.
addopts = "--strict-markers --strict-config --durations=5 -vv"
# Use global timeout of 30 seconds for now.
# Most tests should be closer to ~100 ms, but some of the tests involve
# parsing files. We can adjust on a per test basis later on.
timeout = 30
asyncio_mode = "auto"


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"


================================================
FILE: backend/tests/__init__.py
================================================


================================================
FILE: backend/tests/unit_tests/__init__.py
================================================


================================================
FILE: backend/tests/unit_tests/agent_executor/__init__.py
================================================


================================================
FILE: backend/tests/unit_tests/agent_executor/test_parsing.py
================================================
"""Test parsing logic."""
import mimetypes

from langchain_community.document_loaders import Blob

from app.parsing import MIMETYPE_BASED_PARSER, SUPPORTED_MIMETYPES
from tests.unit_tests.fixtures import get_sample_paths


def test_list_of_supported_mimetypes() -> None:
    """This list should generally grow! Protecting against typos in mimetypes."""
    assert SUPPORTED_MIMETYPES == [
        "application/msword",
        "application/pdf",
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "text/html",
        "text/plain",
    ]


def test_attempt_to_parse_each_fixture() -> None:
    """Attempt to parse supported fixtures."""
    seen_mimetypes = set()
    for path in get_sample_paths():
        type_, _ = mimetypes.guess_type(path)
        if type_ not in SUPPORTED_MIMETYPES:
            continue
        seen_mimetypes.add(type_)
        blob = Blob.from_path(path)
        documents = MIMETYPE_BASED_PARSER.parse(blob)
        try:
            assert len(documents) == 1
            doc = documents[0]
            assert "source" in doc.metadata
            assert doc.metadata["source"] == str(path)
            assert "🦜" in doc.page_content
        except Exception as e:
            raise AssertionError(f"Failed to parse {path}") from e

    known_missing = {"application/msword"}
    assert set(SUPPORTED_MIMETYPES) - known_missing == seen_mimetypes


================================================
FILE: backend/tests/unit_tests/agent_executor/test_upload.py
================================================
from io import BytesIO

from fastapi import UploadFile
from langchain.text_splitter import RecursiveCharacterTextSplitter

from app.upload import IngestRunnable, _guess_mimetype, convert_ingestion_input_to_blob
from tests.unit_tests.fixtures import get_sample_paths
from tests.unit_tests.utils import InMemoryVectorStore


def test_ingestion_runnable() -> None:
    """Test that a small text upload is ingested and yields a single ID."""
    vectorstore = InMemoryVectorStore()
    splitter = RecursiveCharacterTextSplitter()
    runnable = IngestRunnable(
        text_splitter=splitter,
        vectorstore=vectorstore,
        input_key="file_contents",
        assistant_id="TheParrot",
    )
    # Simulate file data
    file_data = BytesIO(b"test data")
    file_data.seek(0)
    # Create UploadFile object
    file = UploadFile(filename="testfile.txt", file=file_data)

    # Convert the file to blob
    blob = convert_ingestion_input_to_blob(file)
    ids = runnable.invoke(blob)
    assert len(ids) == 1


def test_mimetype_guessing() -> None:
    """Verify mimetype guessing for all fixtures."""
    name_to_mime = {}
    for file in sorted(get_sample_paths()):
        data = file.read_bytes()
        name_to_mime[file.name] = _guess_mimetype(file.name, data)

    assert {
        "sample.docx": (
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
        ),
        "sample.epub": "application/epub+zip",
        "sample.html": "text/html",
        "sample.odt": "application/vnd.oasis.opendocument.text",
        "sample.pdf": "application/pdf",
        "sample.rtf": "application/rtf",
        "sample.txt": "text/plain",
    } == name_to_mime


================================================
FILE: backend/tests/unit_tests/app/__init__.py
================================================


================================================
FILE: backend/tests/unit_tests/app/helpers.py
================================================
from contextlib import asynccontextmanager

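from httpx import ASGITransport, AsyncClient
from typing_extensions import AsyncGenerator


@asynccontextmanager
async def get_client() -> AsyncGenerator[AsyncClient, None]:
    """Yield an httpx client that calls the app in-process."""
    from app.server import app

    # The `app=` shortcut was deprecated in httpx 0.27 and removed in 0.28;
    # passing an explicit ASGI transport works with both.
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        yield ac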


================================================
FILE: backend/tests/unit_tests/app/test_app.py
================================================
"""Test the server and client together."""

from typing import Optional, Sequence
from uuid import uuid4

import asyncpg
from pydantic import BaseModel

from app.schema import Assistant, Thread
from tests.unit_tests.app.helpers import get_client


def _project(model: BaseModel, *, exclude_keys: Optional[Sequence[str]] = None) -> dict:
    """Return the model as a dict, leaving out the keys in `exclude_keys`."""
    d = model.model_dump()
    _exclude = set(exclude_keys) if exclude_keys else set()
    return {k: v for k, v in d.items() if k not in _exclude}


async def test_list_and_create_assistants(pool: asyncpg.pool.Pool) -> None:
    """Test list and create assistants."""
    headers = {"Cookie": "opengpts_user_id=1"}
    aid = str(uuid4())

    async with pool.acquire() as conn:
        assert len(await conn.fetch("SELECT * FROM assistant;")) == 0

    async with get_client() as client:
        response = await client.get(
            "/assistants/",
            headers=headers,
        )
        assert response.status_code == 200

        assert response.json() == []

        # Create an assistant
        response = await client.put(
            f"/assistants/{aid}",
            json={"name": "bobby", "config": {}, "public": False},
            headers=headers,
        )
        assert response.status_code == 200
        assistant = Assistant.model_validate(response.json())
        assert _project(assistant, exclude_keys=["updated_at", "user_id"]) == {
            "assistant_id": aid,
            "config": {},
            "name": "bobby",
            "public": False,
        }
        async with pool.acquire() as conn:
            assert len(await conn.fetch("SELECT * FROM assistant;")) == 1

        response = await client.get("/assistants/", headers=headers)
        assistants = [Assistant.model_validate(d) for d in response.json()]
        assert [
            _project(d, exclude_keys=["updated_at", "user_id"]) for d in assistants
        ] == [
            {
                "assistant_id": aid,
                "config": {},
                "name": "bobby",
                "public": False,
            }
        ]

        response = await client.put(
            f"/assistants/{aid}",
            json={"name": "bobby", "config": {}, "public": False},
            headers=headers,
        )

        assistant = Assistant.model_validate(response.json())
        assert _project(assistant, exclude_keys=["updated_at", "user_id"]) == {
            "assistant_id": aid,
            "config": {},
            "name": "bobby",
            "public": False,
        }

        # Check not visible to other users
        headers = {"Cookie": "opengpts_user_id=2"}
        response = await client.get("/assistants/", headers=headers)
        assert response.status_code == 200, response.text
        assert response.json() == []


async def test_threads(pool: asyncpg.pool.Pool) -> None:
    """Test creating a thread, reading its state, and listing threads."""
    headers = {"Cookie": "opengpts_user_id=1"}
    aid = str(uuid4())
    tid = str(uuid4())

    async with get_client() as client:
        response = await client.put(
            f"/assistants/{aid}",
            json={
                "name": "assistant",
                "config": {"configurable": {"type": "chatbot"}},
                "public": False,
            },
            headers=headers,
        )
        assert response.status_code == 200, response.text

        response = await client.put(
            f"/threads/{tid}",
            json={"name": "bobby", "assistant_id": aid},
            headers=headers,
        )
        assert response.status_code == 200, response.text
        _ = Thread.model_validate(response.json())

        response = await client.get(f"/threads/{tid}/state", headers=headers)
        assert response.status_code == 200
        assert response.json() == {"values": None, "next": []}

        response = await client.get("/threads/", headers=headers)

        assert response.status_code == 200
        threads = [Thread.model_validate(d) for d in response.json()]
        assert [
            _project(d, exclude_keys=["updated_at", "user_id"]) for d in threads
        ] == [
            {
                "assistant_id": aid,
                "name": "bobby",
                "thread_id": tid,
                "metadata": {"assistant_type": "chatbot"},
            }
        ]

        # A PUT without a JSON body is rejected by request validation.
        response = await client.put(
            f"/threads/{tid}",
            headers={"Cookie": "opengpts_user_id=2"},
        )
        assert response.status_code == 422


================================================
FILE: backend/tests/unit_tests/app/test_auth.py
================================================
from base64 import b64encode
from datetime import datetime, timedelta, timezone
from typing import Optional
from unittest.mock import MagicMock, patch

import jwt

from app.auth.handlers import AuthedUser, get_auth_handler
from app.auth.settings import (
    AuthType,
    JWTSettingsLocal,
    JWTSettingsOIDC,
)
from app.auth.settings import (
    settings as auth_settings,
)
from app.server import app
from tests.unit_tests.app.helpers import get_client


@app.get("/me")
async def me(user: AuthedUser) -> dict:
    return user.model_dump()


def _create_jwt(
    key: str, alg: str, payload: dict, headers: Optional[dict] = None
) -> str:
    return jwt.encode(payload, key, algorithm=alg, headers=headers)


async def test_noop():
    get_auth_handler.cache_clear()
    auth_settings.auth_type = AuthType.NOOP
    sub = "user_noop"

    async with get_client() as client:
        response = await client.get("/me", cookies={"opengpts_user_id": sub})
        assert response.status_code == 200
        assert response.json()["sub"] == sub


async def test_jwt_local():
    get_auth_handler.cache_clear()
    auth_settings.auth_type = AuthType.JWT_LOCAL
    key = "key"
    auth_settings.jwt_local = JWTSettingsLocal(
        alg="HS256",
        iss="issuer",
        aud="audience",
        decode_key_b64=b64encode(key.encode("utf-8")),
    )
    sub = "user_jwt_local"

    token = _create_jwt(
        key=key,
        alg=auth_settings.jwt_local.alg,
        payload={
            "sub": sub,
            "iss": auth_settings.jwt_local.iss,
            "aud": auth_settings.jwt_local.aud,
            "exp": datetime.now(timezone.utc) + timedelta(days=1),
        },
    )

    async with get_client() as client:
        response = await client.get("/me", headers={"Authorization": f"Bearer {token}"})
        assert response.status_code == 200
        assert response.json()["sub"] == sub

    # Test invalid token
    async with get_client() as client:
        response = await client.get("/me", headers={"Authorization": "Bearer xyz"})
        assert response.status_code == 401


async def test_jwt_oidc():
    get_auth_handler.cache_clear()
    auth_settings.auth_type = AuthType.JWT_OIDC
    auth_settings.jwt_oidc = JWTSettingsOIDC(iss="issuer", aud="audience")
    sub = "user_jwt_oidc"
    key = "key"
    alg = "HS256"

    token = _create_jwt(
        key=key,
        alg=alg,
        payload={
            "sub": sub,
            "iss": auth_settings.jwt_oidc.iss,
            "aud": auth_settings.jwt_oidc.aud,
            "exp": datetime.now(timezone.utc) + timedelta(days=1),
        },
        headers={"kid": "kid", "alg": alg},
    )

    mock_jwk_client = MagicMock()
    mock_jwk_client.get_signing_key.return_value = MagicMock(key=key)

    with patch(
        "app.auth.handlers.JWTAuthOIDC._get_jwk_client", return_value=mock_jwk_client
    ):
        async with get_client() as client:
            response = await client.get(
                "/me", headers={"Authorization": f"Bearer {token}"}
            )
            assert response.status_code == 200
            assert response.json()["sub"] == sub


================================================
FILE: backend/tests/unit_tests/conftest.py
================================================
import asyncio
import os
import subprocess

import asyncpg
import pytest

from app.auth.settings import AuthType
from app.auth.settings import settings as auth_settings
from app.lifespan import get_pg_pool, lifespan
from app.server import app

auth_settings.auth_type = AuthType.NOOP

# Temporary handling of environment variables for testing
os.environ["OPENAI_API_KEY"] = "test"

TEST_DB = "test"
assert os.environ["POSTGRES_DB"] != TEST_DB, "Test and main database conflict."
os.environ["POSTGRES_DB"] = TEST_DB


async def _get_conn() -> asyncpg.Connection:
    return await asyncpg.connect(
        user=os.environ["POSTGRES_USER"],
        password=os.environ["POSTGRES_PASSWORD"],
        host=os.environ["POSTGRES_HOST"],
        port=os.environ["POSTGRES_PORT"],
        database="postgres",
    )


async def _create_test_db() -> None:
    """Check if the test database exists and create it if it doesn't."""
    conn = await _get_conn()
    exists = await conn.fetchval("SELECT 1 FROM pg_database WHERE datname=$1", TEST_DB)
    if not exists:
        await conn.execute(f'CREATE DATABASE "{TEST_DB}"')
    await conn.close()


async def _drop_test_db() -> None:
    """Check if the test database exists and if so, drop it."""
    conn = await _get_conn()
    exists = await conn.fetchval("SELECT 1 FROM pg_database WHERE datname=$1", TEST_DB)
    if exists:
        await conn.execute(f'DROP DATABASE "{TEST_DB}" WITH (FORCE)')
    await conn.close()


def _migrate_test_db() -> None:
    """Run the database migrations through `make migrate`."""
    subprocess.run(["make", "migrate"], check=True)


@pytest.fixture(scope="session")
async def _init_db():
    """Initialize the test database."""
    await _drop_test_db()  # In case previous test session was abruptly terminated
    await _create_test_db()
    _migrate_test_db()


@pytest.fixture(scope="session")
async def pool(_init_db):
    """Initialize database pool with checkpointer."""
    async with lifespan(app):
        yield get_pg_pool()
    await _drop_test_db()


@pytest.fixture(scope="function", autouse=True)
async def clear_test_db(pool):
    """Truncate all tables before each test."""
    async with pool.acquire() as conn:
        query = """
        DO
        $$
        DECLARE
        r RECORD;
        BEGIN
        FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = 'public') LOOP
            EXECUTE 'TRUNCATE TABLE ' || quote_ident(r.tablename) || ' CASCADE;';
        END LOOP;
        END
        $$;
        """
        await conn.execute(query)


@pytest.fixture(scope="session")
def event_loop(request):
    """Share one event loop across the session so session-scoped async fixtures work."""
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()


================================================
FILE: backend/tests/unit_tests/fixtures/__init__.py
================================================
from pathlib import Path
from typing import List

HERE = Path(__file__).parent

# PUBLIC API


def get_sample_paths() -> List[Path]:
    """List all fixtures."""
    return list(HERE.glob("sample.*"))


================================================
FILE: backend/tests/unit_tests/fixtures/sample.html
================================================
<html><head><meta content="text/html; charset=UTF-8" http-equiv="content-type"><style type="text/css">.lst-kix_n6n0tzfwn8i8-5>li:before{content:"\0025a0   "}.lst-kix_n6n0tzfwn8i8-6>li:before{content:"\0025cf   "}ul.lst-kix_n6n0tzfwn8i8-8{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-7{list-style-type:none}.lst-kix_n6n0tzfwn8i8-3>li:before{content:"\0025cf   "}.lst-kix_n6n0tzfwn8i8-4>li:before{content:"\0025cb   "}.lst-kix_n6n0tzfwn8i8-7>li:before{content:"\0025cb   "}.lst-kix_n6n0tzfwn8i8-8>li:before{content:"\0025a0   "}.lst-kix_n6n0tzfwn8i8-1>li:before{content:"\0025cb   "}.lst-kix_n6n0tzfwn8i8-2>li:before{content:"\0025a0   "}li.li-bullet-0:before{margin-left:-18pt;white-space:nowrap;display:inline-block;min-width:18pt}.lst-kix_n6n0tzfwn8i8-0>li:before{content:"\0025cf   "}ul.lst-kix_n6n0tzfwn8i8-2{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-1{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-0{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-6{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-5{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-4{list-style-type:none}ul.lst-kix_n6n0tzfwn8i8-3{list-style-type:none}ol{margin:0;padding:0}table td,table th{padding:0}.c6{border-right-style:solid;padding:5pt 5pt 5pt 
5pt;border-bottom-color:#000000;border-top-width:1pt;border-right-width:1pt;border-left-color:#000000;vertical-align:top;border-right-color:#000000;border-left-width:1pt;border-top-style:solid;border-left-style:solid;border-bottom-width:1pt;width:156pt;border-top-color:#000000;border-bottom-style:solid}.c0{-webkit-text-decoration-skip:none;color:#000000;font-weight:400;text-decoration:underline;vertical-align:baseline;text-decoration-skip-ink:none;font-size:11pt;font-family:"Arial";font-style:normal}.c4{padding-top:0pt;padding-bottom:0pt;line-height:1.0;orphans:2;widows:2;text-align:left;height:11pt}.c11{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Arial";font-style:italic}.c3{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Arial";font-style:normal}.c12{color:#000000;font-weight:700;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Arial";font-style:normal}.c7{padding-top:0pt;padding-bottom:0pt;line-height:1.0;orphans:2;widows:2;text-align:left}.c1{padding-top:0pt;padding-bottom:0pt;line-height:1.15;orphans:2;widows:2;text-align:left}.c8{text-decoration-skip-ink:none;-webkit-text-decoration-skip:none;color:#1155cc;text-decoration:underline}.c14{border-spacing:0;border-collapse:collapse;margin-right:auto}.c13{background-color:#ffffff;max-width:468pt;padding:72pt 72pt 72pt 
72pt}.c15{padding:0;margin:0}.c10{margin-left:36pt;padding-left:0pt}.c5{color:inherit;text-decoration:inherit}.c9{height:11pt}.c2{height:0pt}.title{padding-top:0pt;color:#000000;font-size:26pt;padding-bottom:3pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}.subtitle{padding-top:0pt;color:#666666;font-size:15pt;padding-bottom:16pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}li{color:#000000;font-size:11pt;font-family:"Arial"}p{margin:0;color:#000000;font-size:11pt;font-family:"Arial"}h1{padding-top:20pt;color:#000000;font-size:20pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h2{padding-top:18pt;color:#000000;font-size:16pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h3{padding-top:16pt;color:#434343;font-size:14pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h4{padding-top:14pt;color:#666666;font-size:12pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h5{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h6{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;font-style:italic;orphans:2;widows:2;text-align:left}</style></head><body class="c13 doc-content"><p class="c1"><span class="c3">🦜️ LangChain</span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1"><span class="c0">Underline</span></p><p class="c1 c9"><span class="c0"></span></p><p class="c1"><span class="c12">Bold</span></p><p class="c1 c9"><span class="c12"></span></p><p class="c1"><span 
class="c11">Italics</span></p><p class="c1 c9"><span class="c11"></span></p><p class="c1 c9"><span class="c11"></span></p><a id="t.e89270b97fc18eabe5c666cba79cd82cff5b5c3d"></a><a id="t.0"></a><table class="c14"><tbody><tr class="c2"><td class="c6" colspan="1" rowspan="1"><p class="c4"><span class="c12"></span></p></td><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c12">Col 1</span></p></td><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c12">Col 2</span></p></td></tr><tr class="c2"><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c12">Row 1</span></p></td><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c3">1</span></p></td><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c3">2</span></p></td></tr><tr class="c2"><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c12">Row 2</span></p></td><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c3">3</span></p></td><td class="c6" colspan="1" rowspan="1"><p class="c7"><span class="c3">4</span></p></td></tr></tbody></table><p class="c1 c9"><span class="c3"></span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1"><span>Link: </span><span class="c8"><a class="c5" href="https://www.google.com/url?q=https://www.langchain.com/&amp;sa=D&amp;source=editors&amp;ust=1699572948600868&amp;usg=AOvVaw2T4jvAmPuMvcyed6PrEjq1">https://www.langchain.com/</a></span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1 c9"><span class="c3"></span></p><ul class="c15 lst-kix_n6n0tzfwn8i8-0 start"><li class="c1 c10 li-bullet-0"><span class="c3">Item 1</span></li><li class="c1 c10 li-bullet-0"><span class="c3">Item 2</span></li><li class="c1 c10 li-bullet-0"><span class="c3">Item 3</span></li><li class="c1 c10 li-bullet-0"><span class="c3">We also love cats 🐱</span></li></ul><p class="c1 c9"><span class="c3"></span></p><p class="c1"><span class="c3">Image</span></p><p class="c1 c9"><span 
class="c3"></span></p><p class="c1"><span style="overflow: hidden; display: inline-block; margin: 0.00px 0.00px; border: 0.00px solid #000000; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px); width: 624.00px; height: 132.00px;"><img alt="" src="sample_files/image1.png" style="width: 624.00px; height: 132.00px; margin-left: 0.00px; margin-top: 0.00px; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px);" title=""></span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1 c9"><span class="c3"></span></p><p class="c1 c9"><span class="c3"></span></p></body></html>

================================================
FILE: backend/tests/unit_tests/fixtures/sample.rtf
================================================
{\rtf1\ansi\ansicpg1252\uc0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deff0\adeff0{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\froman\fcharset2\fprq2{\*\panose 05050102010706020507}Symbol;}{\f2\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}}{\colortbl;\red0\green0\blue0;\red17\green85\blue204;
\red67\green67\blue67;\red102\green102\blue102;}{\stylesheet{\s0\snext0\sqformat\spriority0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Normal;}{\s1\sbasedon0\snext0\styrsid15694742
\sqformat\spriority0\keep\keepn\fi0\sb400\sa120\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs40\ltrch\b0\i0\fs40\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 heading 1;}{\s2\sbasedon0\snext0\styrsid15694742
\sqformat\spriority0\keep\keepn\fi0\sb360\sa120\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs32\ltrch\b0\i0\fs32\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 heading 2;}{\s3\sbasedon0\snext0\styrsid15694742
\sqformat\spriority0\keep\keepn\fi0\sb320\sa80\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs28\ltrch\b0\i0\fs28\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf3 heading 3;}{\s4\sbasedon0\snext0\styrsid15694742
\sqformat\spriority0\keep\keepn\fi0\sb280\sa80\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs24\ltrch\b0\i0\fs24\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf4 heading 4;}{\s5\sbasedon0\snext0\styrsid15694742
\sqformat\spriority0\keep\keepn\fi0\sb240\sa80\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf4 heading 5;}{\s6\sbasedon0\snext0\styrsid15694742
\sqformat\spriority0\keep\keepn\fi0\sb240\sa80\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai\af2\afs22\ltrch\b0\i\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf4 heading 6;}{\*\cs10\additive\ssemihidden\spriority0 Default Paragraph Font;
}{\*\ts11\tsrowd\snext11\ssemihidden\spriority0\aspalpha\aspnum\adjustright\ltrpar\li0\lin0\ri0\rin0\ql\faauto\tsvertalt\tsbrdrl\tsbrdrr\tsbrdrt\tsbrdrb\tsbrdrdgr\tsbrdrdgl\tsbrdrh\tsbrdrv\trpaddl108\trpaddfl3\trwWidthB0\trftsWidthB3\trpaddt0\trpaddft3\trpaddb0
\trpaddfb3\trpaddr108\trpaddfr3 Normal Table;}{\s15\sbasedon0\snext15\styrsid15694742\sqformat\spriority0\keep\keepn\fi0\sb0\sa60\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs52\ltrch\b0\i0\fs52\loch\af2
\dbch\af2\hich\f2\strike0\ulnone\cf1 Title;}{\s16\sbasedon0\snext16\styrsid15694742\sqformat\spriority0\keep\keepn\fi0\sb0\sa320\aspalpha\aspnum\adjustright\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs30\ltrch\b0\i0\fs30
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf4 Subtitle;}}{\*\listtable{\list\listtemplateid1{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9679 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx360\fi-360\li720\lin720}{
\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9675 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx1080\fi-360\li1440\lin1440}{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9632 ;}{
\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx1800\fi-180\li2160\lin2160}{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9679 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx2520\fi-360\li2880\lin2880}{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1
{\leveltext \'01\u9675 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx3240\fi-360\li3600\lin3600}{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9632 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx3960\fi-180\li4320\lin4320}
{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9679 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx4680\fi-360\li5040\lin5040}{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9675 ;
}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx5400\fi-360\li5760\lin5760}{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelstartat1{\leveltext \'01\u9632 ;}{\levelnumbers;}\levelfollow0\ulnone\jclisttab\tx6120\fi-180\li6480\lin6480}\listid1}}
{\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}}{\*\rsidtbl\rsid10976062\rsid13249109}{\*\generator Aspose.Words for Java 23.4.0;}{\info\version1\edmins0\nofpages1\nofwords0\nofchars0\nofcharsws0}\paperw12240\paperh15840\margl1440\margr1440\margt1440\margb1440\gutter0
{\mmathPr\mbrkBin0\mbrkBinSub0\mdefJc1\mdispDef1\minterSp0\mintLim0\mintraSp0\mlMargin0\mmathFont0\mnaryLim1\mpostSp0\mpreSp0\mrMargin0\msmallFrac0\mwrapIndent1440\mwrapRight0}\deflang1033\deflangfe2052\adeflang1025\jexpand\showxmlerrors1\validatexml1{
\*\wgrffmtfilter 013f}\viewkind1\viewscale100\fet0\ftnbj\aenddoc\ftnrstcont\aftnrstcont\ftnnar\aftnnrlc\widowctrl\nospaceforul\nolnhtadjtbl\alntblind\lyttblrtgr\dntblnsbdb\noxlattoyen\wrppunct\nobrkwrptbl\expshrtn\snaptogridincell\asianbrkrule\htmautsp\noultrlspc
\useltbaln\splytwnine\ftnlytwnine\lytcalctblwd\allowfieldendsel\lnbrkrule\nouicompat\nofeaturethrottle1\utinl\formshade\nojkernpunct\dghspace180\dgvspace180\dghorigin1800\dgvorigin1440\dghshow1\dgvshow1\dgmargin\pgbrdrhead\pgbrdrfoot\rsidroot10976062\sectd\sectlinegrid360\pgwsxn12240\pghsxn15840\marglsxn1440\margrsxn1440\margtsxn1440\margbsxn1440\guttersxn0\headery720\footery720\colsx720\ltrsect\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar
\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2
\hich\f2\strike0\ulnone\cf1 \u-10178 \u-8804 \u-497  }{\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 LangChain}{\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2
\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw
\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}
\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\alang1025\afs22
\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ul\cf1 Underline}{\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ul\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0
\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2
\dbch\af2\hich\f2\insrsid10976062\strike0\ul\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Bold}{\rtlch\ab\ai0\af2\afs22\ltrch\b\i0\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0
\ulnone\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0
\ulnone\cf1{\rtlch\ab\ai0\af2\afs22\ltrch\b\i0\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0
\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai\af2\alang1025\afs22\ltrch\b0\i\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Italics}
{\rtlch\ab0\ai\af2\afs22\ltrch\b0\i\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql
\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai\af2\afs22\ltrch\b0\i\fs22\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}\pard\plain\itap0\s0\ilvl0\fi0\sb0\sa0\aspalpha
\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl276\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai\af2\afs22\ltrch\b0\i\fs22\loch\af2\dbch\af2
\hich\f2\insrsid10976062\strike0\ulnone\cf1\par}\trowd\irow0\irowband0\ltrrow\trql\trgaph100\trpaddl100\trpaddfl3\trbrdrt\brdrs\brdrw10\brdrcf0\brsp0\trbrdrl\brdrs\brdrw10\brdrcf0\brsp0\trbrdrb\brdrs\brdrw10\brdrcf0\brsp0\trbrdrr\brdrs\brdrw10\brdrcf0\brsp0\trbrdrh\brdrs\brdrw10\brdrcf0\brsp0\trbrdrv
\brdrs\brdrw10\brdrcf0\brsp0\trwWidth5000\trftsWidth2\trautofit1\trwWidthB0\trftsWidthB3\trpaddt0\trpaddft3\trpaddb0\trpaddfb3\trpaddr108\trpaddfr3\trrh0\trleft-100\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100
\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx3020\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100
\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx6140\clwWidth3120\clftsWidth3
\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx9260\tbllkhdrrows\tbllkhdrcols\tbllknocolband\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0
\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Col}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1  1}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22
\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Col}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1  2}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}{\trowd\irow0\irowband0\ltrrow\trql\trgaph100\trpaddl100\trpaddfl3\trbrdrt\brdrs\brdrw10\brdrcf0\brsp0\trbrdrl\brdrs\brdrw10\brdrcf0\brsp0\trbrdrb\brdrs\brdrw10\brdrcf0\brsp0\trbrdrr\brdrs\brdrw10\brdrcf0\brsp0\trbrdrh\brdrs\brdrw10\brdrcf0\brsp0\trbrdrv
\brdrs\brdrw10\brdrcf0\brsp0\trwWidth5000\trftsWidth2\trautofit1\trwWidthB0\trftsWidthB3\trpaddt0\trpaddft3\trpaddb0\trpaddfb3\trpaddr108\trpaddfr3\trrh0\trleft-100\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100
\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx3020\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100
\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx6140\clwWidth3120\clftsWidth3
\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx9260\tbllkhdrrows\tbllkhdrcols\tbllknocolband\rtlch\ab\ai0\af2\alang1025\afs22
\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\row}\trowd\irow1\irowband1\ltrrow\trql\trgaph100\trpaddl100\trpaddfl3\trbrdrt\brdrs\brdrw10\brdrcf0\brsp0\trbrdrl\brdrs\brdrw10\brdrcf0\brsp0\trbrdrb\brdrs\brdrw10\brdrcf0\brsp0\trbrdrr\brdrs\brdrw10\brdrcf0\brsp0\trbrdrh\brdrs\brdrw10\brdrcf0\brsp0\trbrdrv
\brdrs\brdrw10\brdrcf0\brsp0\trwWidth5000\trftsWidth2\trautofit1\trwWidthB0\trftsWidthB3\trpaddt0\trpaddft3\trpaddb0\trpaddfb3\trpaddr108\trpaddfr3\trrh0\trleft-100\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100
\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx3020\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100
\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx6140\clwWidth3120\clftsWidth3
\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx9260\tbllkhdrrows\tbllkhdrcols\tbllknocolband\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Row}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1  1}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0
\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 1}{\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0
\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 2}{\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}{\trowd\irow1\irowband1\ltrrow\trql\trgaph100\trpaddl100\trpaddfl3\trbrdrt\brdrs\brdrw10\brdrcf0\brsp0\trbrdrl\brdrs\brdrw10\brdrcf0\brsp0\trbrdrb\brdrs\brdrw10\brdrcf0\brsp0\trbrdrr\brdrs\brdrw10\brdrcf0\brsp0\trbrdrh\brdrs\brdrw10\brdrcf0\brsp0\trbrdrv
\brdrs\brdrw10\brdrcf0\brsp0\trwWidth5000\trftsWidth2\trautofit1\trwWidthB0\trftsWidthB3\trpaddt0\trpaddft3\trpaddb0\trpaddfb3\trpaddr108\trpaddfr3\trrh0\trleft-100\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100
\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx3020\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100
\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx6140\clwWidth3120\clftsWidth3
\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx9260\tbllkhdrrows\tbllkhdrcols\tbllknocolband
\rtlch\ab0\ai0\af2\alang1025\afs22\ltrch\b0\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\row}\trowd\irow2\irowband2\lastrow\ltrrow\trql\trgaph100\trpaddl100\trpaddfl3\trbrdrt\brdrs\brdrw10\brdrcf0\brsp0\trbrdrl\brdrs\brdrw10\brdrcf0\brsp0\trbrdrb\brdrs\brdrw10\brdrcf0\brsp0\trbrdrr\brdrs\brdrw10\brdrcf0\brsp0\trbrdrh\brdrs\brdrw10\brdrcf0\brsp0
\trbrdrv\brdrs\brdrw10\brdrcf0\brsp0\trwWidth5000\trftsWidth2\trautofit1\trwWidthB0\trftsWidthB3\trpaddt0\trpaddft3\trpaddb0\trpaddfb3\trpaddr108\trpaddfr3\trrh0\trleft-100\clwWidth3120\clftsWidth3\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100
\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx3020\clwWidth3120\clftsWidth3\clvertalt\clpadl100
\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx6140\clwWidth3120\clftsWidth3
\clvertalt\clpadl100\clpadfl3\clpadb100\clpadfb3\clpadt100\clpadft3\clpadr100\clpadfr3\clbrdrl\brdrs\brdrw20\brdrcf1\brsp0\clbrdrr\brdrs\brdrw20\brdrcf1\brsp0\clbrdrt\brdrs\brdrw20\brdrcf1\brsp0\clbrdrb\brdrs\brdrw20\brdrcf1\brsp0\cldgll\brdrnone\cldglu\brdrnone\cellx9260\tbllkhdrrows\tbllkhdrcols\tbllknocolband\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1 Row}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033
\loch\af2\dbch\af2\hich\f2\strike0\ulnone\cf1  2}{\rtlch\ab\ai0\af2\alang1025\afs22\ltrch\b\i0\fs22\lang1033\langnp1033\langfe1033\langfenp1033\loch\af2\dbch\af2\hich\f2\insrsid10976062\strike0\ulnone\cf1\cell}\pard\plain\intbl\itap1\s0\ilvl0\fi0\sb0\sa0
\aspalpha\aspnum\adjustright\brdrt\brdrl\brdrb\brdrr\brdrbtw\brdrbar\widctlpar\ltrpar\li0\lin0\ri0\rin0\ql\faauto\sl240\slmult1\rtlch\ab0\ai0\af2\afs22\ltrch\b0\i0\fs22\




SYMBOL INDEX (662 symbols across 73 files)

FILE: backend/app/agent.py
  class AgentType (line 62) | class AgentType(str, Enum):
  function get_agent_executor (line 78) | def get_agent_executor(
  class ConfigurableAgent (line 128) | class ConfigurableAgent(RunnableBinding):
    method __init__ (line 138) | def __init__(
  class LLMType (line 185) | class LLMType(str, Enum):
  function get_chatbot (line 197) | def get_chatbot(
  class ConfigurableChatBot (line 224) | class ConfigurableChatBot(RunnableBinding):
    method __init__ (line 229) | def __init__(
  class ConfigurableRetrieval (line 263) | class ConfigurableRetrieval(RunnableBinding):
    method __init__ (line 270) | def __init__(
  function run (line 378) | async def run():

FILE: backend/app/agent_types/tools_agent.py
  function get_tools_agent_executor (line 20) | def get_tools_agent_executor(

FILE: backend/app/agent_types/xml_agent.py
  function _collapse_messages (line 19) | def _collapse_messages(messages):
  function construct_chat_history (line 38) | def construct_chat_history(messages):
  function get_xml_agent_executor (line 62) | def get_xml_agent_executor(

FILE: backend/app/api/__init__.py
  function ok (line 11) | async def ok():

FILE: backend/app/api/assistants.py
  class AssistantPayload (line 14) | class AssistantPayload(BaseModel):
  function list_assistants (line 28) | async def list_assistants(user: AuthedUser) -> List[Assistant]:
  function list_public_assistants (line 34) | async def list_public_assistants() -> List[Assistant]:
  function get_assistant (line 40) | async def get_assistant(
  function create_assistant (line 52) | async def create_assistant(
  function upsert_assistant (line 67) | async def upsert_assistant(
  function delete_assistant (line 83) | async def delete_assistant(

FILE: backend/app/api/runs.py
  class CreateRunPayload (line 21) | class CreateRunPayload(BaseModel):
  function _run_input_and_config (line 31) | async def _run_input_and_config(payload: CreateRunPayload, user_id: str):
  function create_run (line 71) | async def create_run(
  function stream_run (line 83) | async def stream_run(
  function input_schema (line 94) | async def input_schema() -> dict:
  function output_schema (line 100) | async def output_schema() -> dict:
  function config_schema (line 106) | async def config_schema() -> dict:
  class FeedbackCreateRequest (line 114) | class FeedbackCreateRequest(BaseModel):
  function create_run_feedback (line 135) | def create_run_feedback(feedback_create_req: FeedbackCreateRequest) -> d...

FILE: backend/app/api/threads.py
  class ThreadPutRequest (line 18) | class ThreadPutRequest(BaseModel):
  class ThreadPostRequest (line 25) | class ThreadPostRequest(BaseModel):
  function list_threads (line 33) | async def list_threads(user: AuthedUser) -> List[Thread]:
  function get_thread_state (line 39) | async def get_thread_state(
  function add_thread_state (line 58) | async def add_thread_state(
  function get_thread_history (line 79) | async def get_thread_history(
  function get_thread (line 98) | async def get_thread(
  function create_thread (line 110) | async def create_thread(
  function upsert_thread (line 124) | async def upsert_thread(
  function delete_thread (line 139) | async def delete_thread(

FILE: backend/app/auth/handlers.py
  class AuthHandler (line 15) | class AuthHandler(ABC):
    method __call__ (line 17) | async def __call__(self, request: Request) -> User:
  class NOOPAuth (line 21) | class NOOPAuth(AuthHandler):
    method __call__ (line 24) | async def __call__(self, request: Request) -> User:
  class JWTAuthBase (line 30) | class JWTAuthBase(AuthHandler):
    method __call__ (line 31) | async def __call__(self, request: Request) -> User:
    method decode_token (line 44) | def decode_token(self, token: str, decode_key: str) -> dict:
    method get_decode_key (line 48) | def get_decode_key(self, token: str) -> str:
  class JWTAuthLocal (line 52) | class JWTAuthLocal(JWTAuthBase):
    method decode_token (line 55) | def decode_token(self, token: str, decode_key: str) -> dict:
    method get_decode_key (line 65) | def get_decode_key(self, token: str) -> str:
  class JWTAuthOIDC (line 69) | class JWTAuthOIDC(JWTAuthBase):
    method decode_token (line 72) | def decode_token(self, token: str, decode_key: str) -> dict:
    method get_decode_key (line 83) | def get_decode_key(self, token: str) -> str:
    method _decode_complete_unverified (line 90) | def _decode_complete_unverified(self, token: str) -> dict:
    method _get_jwk_client (line 94) | def _get_jwk_client(self, issuer: str) -> jwt.PyJWKClient:
  function get_auth_handler (line 106) | def get_auth_handler() -> AuthHandler:
  function auth_user (line 114) | async def auth_user(

FILE: backend/app/auth/settings.py
  class AuthType (line 10) | class AuthType(Enum):
  class JWTSettingsBase (line 16) | class JWTSettingsBase(BaseSettings):
    method set_aud (line 22) | def set_aud(cls, v) -> Union[str, List[str]]:
  class JWTSettingsLocal (line 32) | class JWTSettingsLocal(JWTSettingsBase):
    method set_decode_key (line 39) | def set_decode_key(cls, v, info):
  class JWTSettingsOIDC (line 51) | class JWTSettingsOIDC(JWTSettingsBase):
  class Settings (line 55) | class Settings(BaseSettings):
    method check_jwt_settings (line 62) | def check_jwt_settings(cls, values):

FILE: backend/app/chatbot.py
  function get_chatbot_executor (line 11) | def get_chatbot_executor(

FILE: backend/app/checkpoint.py
  class AsyncPostgresCheckpoint (line 21) | class AsyncPostgresCheckpoint(BasePostgresSaver):
    method __new__ (line 26) | def __new__(cls, *args, **kwargs):
    method __init__ (line 31) | def __init__(
    method ensure_setup (line 45) | async def ensure_setup(self) -> None:
    method setup (line 51) | async def setup(self) -> None:
    method alist (line 81) | async def alist(
    method aget_tuple (line 95) | async def aget_tuple(self, config: RunnableConfig) -> Optional[Checkpo...
    method aput (line 99) | async def aput(
    method aput_writes (line 111) | async def aput_writes(

FILE: backend/app/ingest.py
  function _update_document_metadata (line 18) | def _update_document_metadata(document: Document, namespace: str) -> None:
  function _sanitize_document_content (line 23) | def _sanitize_document_content(document: Document) -> Document:
  function ingest_blob (line 33) | def ingest_blob(

FILE: backend/app/lifespan.py
  function get_pg_pool (line 14) | def get_pg_pool() -> asyncpg.pool.Pool:
  function _init_connection (line 18) | async def _init_connection(conn) -> None:
  function lifespan (line 37) | async def lifespan(app: FastAPI):

FILE: backend/app/llms.py
  function get_openai_llm (line 18) | def get_openai_llm(model: str = "gpt-3.5-turbo", azure: bool = False):
  function get_anthropic_llm (line 61) | def get_anthropic_llm(bedrock: bool = False):
  function get_google_llm (line 80) | def get_google_llm():
  function get_mixtral_fireworks (line 87) | def get_mixtral_fireworks():
  function get_ollama_llm (line 92) | def get_ollama_llm():

FILE: backend/app/message_types.py
  class LiberalFunctionMessage (line 13) | class LiberalFunctionMessage(FunctionMessage):
  class LiberalToolMessage (line 17) | class LiberalToolMessage(ToolMessage):
  function _convert_pydantic_dict_to_message (line 21) | def _convert_pydantic_dict_to_message(
  function add_messages_liberal (line 35) | def add_messages_liberal(left: Messages, right: Messages):

FILE: backend/app/retrieval.py
  function get_retrieval_executor (line 38) | def get_retrieval_executor(

FILE: backend/app/schema.py
  class User (line 7) | class User(BaseModel):
  class Assistant (line 16) | class Assistant(BaseModel):
  class Thread (line 31) | class Thread(BaseModel):

FILE: backend/app/server.py
  function ingest_files (line 29) | async def ingest_files(
  function health (line 52) | async def health() -> dict:

FILE: backend/app/storage.py
  function list_assistants (line 12) | async def list_assistants(user_id: str) -> List[Assistant]:
  function get_assistant (line 21) | async def get_assistant(user_id: str, assistant_id: str) -> Optional[Ass...
  function list_public_assistants (line 34) | async def list_public_assistants() -> List[Assistant]:
  function put_assistant (line 41) | async def put_assistant(
  function delete_assistant (line 86) | async def delete_assistant(user_id: str, assistant_id: str) -> None:
  function list_threads (line 96) | async def list_threads(user_id: str) -> List[Thread]:
  function get_thread (line 103) | async def get_thread(user_id: str, thread_id: str) -> Optional[Thread]:
  function get_thread_state (line 116) | async def get_thread_state(*, user_id: str, thread_id: str, assistant: A...
  function update_thread_state (line 136) | async def update_thread_state(
  function get_thread_history (line 180) | async def get_thread_history(*, user_id: str, thread_id: str, assistant:...
  function get_assistant_type (line 201) | def get_assistant_type(config: dict) -> str:
  function put_thread (line 213) | async def put_thread(
  function delete_thread (line 250) | async def delete_thread(user_id: str, thread_id: str):
  function get_or_create_user (line 260) | async def get_or_create_user(sub: str) -> tuple[User, bool]:

FILE: backend/app/stream.py
  function astream_state (line 14) | async def astream_state(
  function _default (line 57) | def _default(obj) -> Any:
  function to_sse (line 66) | async def to_sse(messages_stream: MessagesStream) -> AsyncIterator[dict]:

FILE: backend/app/tools.py
  class DDGInput (line 30) | class DDGInput(BaseModel):
  class ArxivInput (line 34) | class ArxivInput(BaseModel):
  class PythonREPLInput (line 38) | class PythonREPLInput(BaseModel):
  class DallEInput (line 42) | class DallEInput(BaseModel):
  class AvailableTools (line 46) | class AvailableTools(str, Enum):
  class ToolConfig (line 62) | class ToolConfig(TypedDict):
  class BaseTool (line 66) | class BaseTool(BaseModel):
  class ActionServerConfig (line 74) | class ActionServerConfig(ToolConfig):
  class ActionServer (line 79) | class ActionServer(BaseTool):
  class Connery (line 95) | class Connery(BaseTool):
  class DDGSearch (line 109) | class DDGSearch(BaseTool):
  class Arxiv (line 117) | class Arxiv(BaseTool):
  class YouSearch (line 125) | class YouSearch(BaseTool):
  class SecFilings (line 133) | class SecFilings(BaseTool):
  class PressReleases (line 141) | class PressReleases(BaseTool):
  class PubMed (line 149) | class PubMed(BaseTool):
  class Wikipedia (line 157) | class Wikipedia(BaseTool):
  class Tavily (line 165) | class Tavily(BaseTool):
  class TavilyAnswer (line 179) | class TavilyAnswer(BaseTool):
  class Retrieval (line 193) | class Retrieval(BaseTool):
  class DallE (line 201) | class DallE(BaseTool):
  function get_retriever (line 214) | def get_retriever(assistant_id: str, thread_id: str):
  function get_retrieval_tool (line 221) | def get_retrieval_tool(assistant_id: str, thread_id: str, description: s...
  function _get_duck_duck_go (line 230) | def _get_duck_duck_go():
  function _get_arxiv (line 235) | def _get_arxiv():
  function _get_you_search (line 240) | def _get_you_search():
  function _get_sec_filings (line 249) | def _get_sec_filings():
  function _get_press_releases (line 260) | def _get_press_releases():
  function _get_pubmed (line 271) | def _get_pubmed():
  function _get_wikipedia (line 278) | def _get_wikipedia():
  function _get_tavily (line 285) | def _get_tavily():
  function _get_tavily_answer (line 291) | def _get_tavily_answer():
  function _get_connery_actions (line 297) | def _get_connery_actions():
  function _get_dalle_tools (line 305) | def _get_dalle_tools():

FILE: backend/app/upload.py
  function _guess_mimetype (line 33) | def _guess_mimetype(file_name: str, file_bytes: bytes) -> str:
  function convert_ingestion_input_to_blob (line 69) | def convert_ingestion_input_to_blob(file: UploadFile) -> Blob:
  function _determine_azure_or_openai_embeddings (line 86) | def _determine_azure_or_openai_embeddings() -> PGVector:
  class IngestRunnable (line 110) | class IngestRunnable(RunnableSerializable[BinaryIO, List[str]]):
    method namespace (line 125) | def namespace(self) -> str:
    method invoke (line 134) | def invoke(self, blob: Blob, config: Optional[RunnableConfig] = None) ...

FILE: backend/migrations/000001_create_extensions_and_first_tables.up.sql
  type assistant (line 4) | CREATE TABLE IF NOT EXISTS assistant (
  type thread (line 13) | CREATE TABLE IF NOT EXISTS thread (
  type checkpoints (line 21) | CREATE TABLE IF NOT EXISTS checkpoints (

FILE: backend/migrations/000003_create_user.up.sql
  type "user" (line 1) | CREATE TABLE IF NOT EXISTS "user" (

FILE: backend/tests/unit_tests/agent_executor/test_parsing.py
  function test_list_of_supported_mimetypes (line 10) | def test_list_of_supported_mimetypes() -> None:
  function test_attempt_to_parse_each_fixture (line 21) | def test_attempt_to_parse_each_fixture() -> None:

FILE: backend/tests/unit_tests/agent_executor/test_upload.py
  function test_ingestion_runnable (line 11) | def test_ingestion_runnable() -> None:
  function test_mimetype_guessing (line 33) | def test_mimetype_guessing() -> None:

FILE: backend/tests/unit_tests/app/helpers.py
  function get_client (line 8) | async def get_client() -> AsyncGenerator[AsyncClient, None]:

FILE: backend/tests/unit_tests/app/test_app.py
  function _project (line 13) | def _project(model: BaseModel, *, exclude_keys: Optional[Sequence[str]] ...
  function test_list_and_create_assistants (line 20) | async def test_list_and_create_assistants(pool: asyncpg.pool.Pool) -> None:
  function test_threads (line 88) | async def test_threads(pool: asyncpg.pool.Pool) -> None:

FILE: backend/tests/unit_tests/app/test_auth.py
  function me (line 22) | async def me(user: AuthedUser) -> dict:
  function _create_jwt (line 26) | def _create_jwt(
  function test_noop (line 32) | async def test_noop():
  function test_jwt_local (line 43) | async def test_jwt_local():
  function test_jwt_oidc (line 77) | async def test_jwt_oidc():

FILE: backend/tests/unit_tests/conftest.py
  function _get_conn (line 23) | async def _get_conn() -> asyncpg.Connection:
  function _create_test_db (line 33) | async def _create_test_db() -> None:
  function _drop_test_db (line 42) | async def _drop_test_db() -> None:
  function _migrate_test_db (line 51) | def _migrate_test_db() -> None:
  function _init_db (line 56) | async def _init_db():
  function pool (line 64) | async def pool(_init_db):
  function clear_test_db (line 72) | async def clear_test_db(pool):
  function event_loop (line 91) | def event_loop(request):

FILE: backend/tests/unit_tests/fixtures/__init__.py
  function get_sample_paths (line 9) | def get_sample_paths() -> List[Path]:

FILE: backend/tests/unit_tests/test_imports.py
  function test_import_app (line 4) | def test_import_app() -> None:

FILE: backend/tests/unit_tests/utils.py
  class InMemoryVectorStore (line 9) | class InMemoryVectorStore(VectorStore):
    method __init__ (line 12) | def __init__(self) -> None:
    method delete (line 16) | def delete(self, ids: Optional[Sequence[str]] = None, **kwargs: Any) -...
    method adelete (line 22) | async def adelete(self, ids: Optional[Sequence[str]] = None, **kwargs:...
    method add_documents (line 28) | def add_documents(
    method aadd_documents (line 53) | async def aadd_documents(
    method add_texts (line 77) | def add_texts(
    method from_texts (line 88) | def from_texts(
    method similarity_search (line 98) | def similarity_search(

FILE: frontend/src/App.tsx
  function App (line 21) | function App(props: { edit?: boolean }) {

FILE: frontend/src/api/assistants.ts
  function getAssistant (line 3) | async function getAssistant(
  function getAssistants (line 18) | async function getAssistants(): Promise<Config[] | null> {

FILE: frontend/src/api/threads.ts
  function getThread (line 3) | async function getThread(threadId: string): Promise<Chat | null> {

FILE: frontend/src/components/Chat.tsx
  type ChatProps (line 17) | interface ChatProps extends Pick<StreamStateProps, "stream" | "stopStrea...
  function usePrevious (line 25) | function usePrevious<T>(value: T): T | undefined {
  function CommitEdits (line 33) | function CommitEdits(props: {
  function Chat (line 65) | function Chat(props: ChatProps) {

FILE: frontend/src/components/ChatList.tsx
  function ChatList (line 9) | function ChatList(props: {

FILE: frontend/src/components/Config.tsx
  function Types (line 27) | function Types(props: {
  function Label (line 90) | function Label(props: { id?: string; title: string; description?: string...
  function StringField (line 106) | function StringField(props: {
  function SingleOptionField (line 135) | function SingleOptionField(props: {
  function ToolSelectionField (line 261) | function ToolSelectionField(props: {
  function PublicLink (line 415) | function PublicLink() {
  function PublicToggle (line 445) | function PublicToggle(props: {
  function fileId (line 475) | function fileId(file: File) {
  constant ORDER (line 479) | const ORDER = [
  function assignDefaults (line 488) | function assignDefaults(
  function Config (line 503) | function Config(props: {

FILE: frontend/src/components/ConfigList.tsx
  function ConfigItem (line 7) | function ConfigItem(props: {
  function ConfigList (line 75) | function ConfigList(props: {

FILE: frontend/src/components/Document.tsx
  function isValidHttpUrl (line 7) | function isValidHttpUrl(str: string) {
  function DocumentViewer (line 19) | function DocumentViewer(props: {
  function DocumentList (line 112) | function DocumentList(props: {

FILE: frontend/src/components/FileUpload.tsx
  function Label (line 33) | function Label(props: { id: string; title: string }) {
  function FileUploadDropzone (line 44) | function FileUploadDropzone(props: {

FILE: frontend/src/components/JsonEditor.tsx
  function JsonEditor (line 7) | function JsonEditor(props: {

FILE: frontend/src/components/LangSmithActions.tsx
  function LangSmithActions (line 9) | function LangSmithActions(props: { runId: string }) {

FILE: frontend/src/components/Layout.tsx
  function Layout (line 5) | function Layout(props: {

FILE: frontend/src/components/Message.tsx
  function isDocumentContent (line 12) | function isDocumentContent(
  function MessageContent (line 21) | function MessageContent(props: { content: MessageType["content"] }) {

FILE: frontend/src/components/MessageEditor.tsx
  function ToolCallEditor (line 15) | function ToolCallEditor(props: {
  function ToolCallsEditor (line 147) | function ToolCallsEditor(props: {
  function MessageContentEditor (line 201) | function MessageContentEditor(props: {

FILE: frontend/src/components/NewChat.tsx
  type NewChatProps (line 14) | interface NewChatProps extends ConfigListProps {
  function NewChat (line 25) | function NewChat(props: NewChatProps) {

FILE: frontend/src/components/NotFound.tsx
  function NotFound (line 1) | function NotFound() {

FILE: frontend/src/components/OrphanChat.tsx
  function OrphanChat (line 7) | function OrphanChat(props: {

FILE: frontend/src/components/String.tsx
  constant OPTIONS (line 5) | const OPTIONS: MarkedOptions = {
  function StringViewer (line 10) | function StringViewer(props: {

FILE: frontend/src/components/StringEditor.tsx
  constant COMMON_CLS (line 3) | const COMMON_CLS = cn(
  function StringEditor (line 7) | function StringEditor(props: {

FILE: frontend/src/components/Tool.tsx
  function ToolRequest (line 6) | function ToolRequest(
  function ToolResponse (line 56) | function ToolResponse(props: {

FILE: frontend/src/components/TypingBox.tsx
  function getFileTypeIcon (line 17) | function getFileTypeIcon(fileType: string) {
  function FileIcon (line 33) | function FileIcon(props: { fileType: string }) {
  function convertBytesToReadableSize (line 37) | function convertBytesToReadableSize(bytes: number) {
  function TypingBox (line 47) | function TypingBox(props: {

FILE: frontend/src/constants.ts
  constant TYPES (line 1) | const TYPES = {
  type TYPE_NAME (line 25) | type TYPE_NAME = (typeof TYPES)[keyof typeof TYPES]["id"];
  constant DROPZONE_CONFIG (line 27) | const DROPZONE_CONFIG = {

FILE: frontend/src/hooks/useChatList.ts
  type ChatListProps (line 5) | interface ChatListProps {
  function chatsReducer (line 16) | function chatsReducer(
  function useChatList (line 31) | function useChatList(): ChatListProps {

FILE: frontend/src/hooks/useChatMessages.ts
  function getState (line 5) | async function getState(threadId: string) {
  function usePrevious (line 14) | function usePrevious<T>(value: T): T | undefined {
  function useChatMessages (line 22) | function useChatMessages(

FILE: frontend/src/hooks/useConfigList.ts
  type Config (line 5) | interface Config {
  type ConfigListProps (line 24) | interface ConfigListProps {
  function configsReducer (line 36) | function configsReducer(
  function useConfigList (line 51) | function useConfigList(): ConfigListProps {

FILE: frontend/src/hooks/useMessageEditing.ts
  function useMessageEditing (line 5) | function useMessageEditing(

FILE: frontend/src/hooks/useSchemas.ts
  type SchemaField (line 5) | interface SchemaField {
  type Schemas (line 14) | interface Schemas {
  function useSchemas (line 31) | function useSchemas() {

FILE: frontend/src/hooks/useStatePersist.tsx
  constant PREFIX (line 3) | const PREFIX = "langgizmo-";
  function useStatePersist (line 5) | function useStatePersist<T>(

FILE: frontend/src/hooks/useStreamState.tsx
  type StreamState (line 6) | interface StreamState {
  type StreamStateProps (line 12) | interface StreamStateProps {
  function useStreamState (line 22) | function useStreamState(): StreamStateProps {
  function mergeMessagesById (line 114) | function mergeMessagesById(

FILE: frontend/src/hooks/useThreadAndAssistant.ts
  function useThreadAndAssistant (line 6) | function useThreadAndAssistant() {

FILE: frontend/src/hooks/useToolsSchemas.ts
  type SchemaItem (line 5) | interface SchemaItem {
  type ConfigSchema (line 15) | interface ConfigSchema {
  function useToolsSchemas (line 29) | function useToolsSchemas() {

FILE: frontend/src/main.tsx
  function getCookie (line 10) | function getCookie(name: string) {

FILE: frontend/src/types.ts
  type ToolCall (line 1) | interface ToolCall {
  type MessageDocument (line 7) | interface MessageDocument {
  type Message (line 12) | interface Message {
  type Chat (line 22) | interface Chat {

FILE: frontend/src/utils/cn.ts
  function cn (line 5) | function cn(...inputs: ClassValue[]) {

FILE: frontend/src/utils/defaults.ts
  function getDefaults (line 209) | function getDefaults(

FILE: frontend/src/utils/formTypes.ts
  type MessageWithFiles (line 1) | type MessageWithFiles = {
  type Tool (line 6) | interface Tool {
  type ToolConfig (line 14) | interface ToolConfig {
  type ToolSchema (line 18) | interface ToolSchema {
  type PropertySchema (line 26) | interface PropertySchema {
  type ToolConfigSchema (line 32) | interface ToolConfigSchema {

FILE: frontend/src/utils/json-refs.js
  function r (line 5) | function r(e) {
  function e (line 88) | function e(t) {
  function r (line 113) | function r(t) {
  function r (line 141) | function r(t) {
  function r (line 236) | function r(t) {
  function e (line 265) | function e(t) {
  function i (line 293) | function i() {
  function u (line 296) | function u() {
  function c (line 299) | function c(t) {
  function p (line 329) | function p() {
  function h (line 334) | function h() {
  function v (line 360) | function v(t, n) {
  function d (line 363) | function d() {}
  function a (line 434) | function a(t) {
  function e (line 538) | function e(t) {
  function e (line 580) | function e(t) {
  function i (line 597) | function i(t) {
  function u (line 616) | function u(t, n) {
  function c (line 619) | function c(t, n) {
  function a (line 622) | function a(t, n, r, e) {
  function s (line 631) | function s(t, n, r, e) {
  function f (line 641) | function f(t, n) {
  function s (line 963) | function s(t) {
  function a (line 989) | function a(t) {
  function e (line 1030) | function e(t) {
  function e (line 1116) | function e(t) {
  function e (line 1153) | function e(t) {
  function r (line 1188) | function r(t) {
  function r (line 1281) | function r(t) {
  function t (line 1378) | function t() {}
  function u (line 1459) | function u(t) {
  function o (line 1580) | function o() {
  function o (line 1695) | function o(t) {
  function i (line 1714) | function i() {}
  function s (line 1750) | function s(t, r) {
  function e (line 1790) | function e(t) {
  function s (line 1812) | function s(t) {
  function u (line 1856) | function u(t, e) {
  function c (line 1859) | function c(t) {
  function a (line 1867) | function a() {
  function f (line 1876) | function f(t, n, r) {
  function l (line 1890) | function l(t) {
  function p (line 1919) | function p(t) {
  function h (line 1928) | function h(t, n, r, e) {
  function v (line 1936) | function v(t) {
  function d (line 1939) | function d(t) {
  function y (line 1946) | function y(t) {
  function o (line 1986) | function o(t, n) {
  function d (line 2088) | function d(t, n) {
  function y (line 2119) | function y(t) {
  function _ (line 2122) | function _(t) {
  function g (line 2125) | function g(t, n) {
  function b (line 2136) | function b(t) {
  function m (line 2141) | function m(t) {
  function w (line 2156) | function w(t, n) {
  function x (line 2182) | function x(t, n) {
  function j (line 2193) | function j(t) {
  function E (line 2198) | function E(t, n) {
  function S (line 2201) | function S(t) {
  function O (line 2204) | function O(t, n, r) {
  function A (line 2207) | function A(t, n) {
  function T (line 2303) | function T(t) {
  function C (line 2312) | function C(t) {
  function I (line 2321) | function I(t, n) {
  function P (line 2379) | function P(t) {
  function k (line 2412) | function k(t, n) {
  function R (line 2430) | function R(t) {
  function D (line 2439) | function D(t, n) {
  function U (line 2445) | function U(t, n) {
  function i (line 2758) | function i(t) {
  function cn (line 3034) | function cn(t, n, r) {
  function an (line 3047) | function an(t, n, r, e) {
  function sn (line 3054) | function sn(t, n) {
  function fn (line 3062) | function fn(t, n) {
  function ln (line 3066) | function ln(t, n) {
  function pn (line 3071) | function pn(t, n) {
  function hn (line 3082) | function hn(t, n) {
  function vn (line 3085) | function vn(t, n, r) {
  function dn (line 3090) | function dn(t, n) {
  function yn (line 3099) | function yn(t, n) {
  function _n (line 3104) | function _n(t, n, r, e) {
  function gn (line 3110) | function gn(t, n, r, e) {
  function bn (line 3115) | function bn(t, n) {
  function wn (line 3121) | function wn(t, n, r) {
  function xn (line 3130) | function xn(t, n, r, e) {
  function jn (line 3135) | function jn(t, n, r) {
  function En (line 3145) | function En(t, n, r, e) {
  function Sn (line 3149) | function Sn(t) {
  function On (line 3152) | function On(t, n) {
  function An (line 3156) | function An(t) {
  function Tn (line 3161) | function Tn(t) {
  function Cn (line 3166) | function Cn(t, n, r, e, o) {
  function In (line 3174) | function In(t, n) {
  function Pn (line 3181) | function Pn(t, n) {
  function kn (line 3185) | function kn(t) {
  function Rn (line 3190) | function Rn(t, n) {
  function Dn (line 3195) | function Dn(t, n) {
  function Un (line 3198) | function Un(t, n) {
  function Nn (line 3202) | function Nn(t, n) {
  function zn (line 3206) | function zn(t, n) {
  function qn (line 3409) | function qn(t) {
  function Mn (line 3412) | function Mn(t) {
  function $n (line 3415) | function $n(t) {
  function Bn (line 3425) | function Bn(t, n) {
  function Hn (line 3430) | function Hn(t, n) {
  function Wn (line 3437) | function Wn(t) {
  function Vn (line 3447) | function Vn(t) {
  function Gn (line 3457) | function Gn(t) {
  function Zn (line 3466) | function Zn(t) {
  function Cr (line 3567) | function Cr(t) {
  function t (line 3575) | function t() {}
  function Pr (line 3584) | function Pr() {}
  function kr (line 3585) | function kr(t, n) {
  function Rr (line 3592) | function Rr(t) {
  function Dr (line 3601) | function Dr(t) {
  function Ur (line 3609) | function Ur(t) {
  function Nr (line 3617) | function Nr(t) {
  function zr (line 3625) | function zr(t) {
  function Fr (line 3630) | function Fr(t) {
  function Lr (line 3634) | function Lr(t, n) {
  function qr (line 3655) | function qr(t) {
  function Mr (line 3659) | function Mr(t, n) {
  function $r (line 3662) | function $r(t) {
  function Br (line 3665) | function Br(t, n, r) {
  function Hr (line 3669) | function Hr(t, n, r) {
  function Wr (line 3674) | function Wr(t, n) {
  function Vr (line 3678) | function Vr(t, n, r, e) {
  function Gr (line 3686) | function Gr(t, n) {
  function Zr (line 3689) | function Zr(t, n, r) {
  function Jr (line 3699) | function Jr(t, n) {
  function Xr (line 3704) | function Xr(t, n, r) {
  function Kr (line 3712) | function Kr(t, n, r, e, o, i) {
  function Yr (line 3816) | function Yr(t, n, r) {
  function Qr (line 3827) | function Qr(t, n, r) {
  function te (line 3833) | function te(t, n, r, e) {
  function ee (line 3979) | function ee(t, n) {
  function oe (line 3988) | function oe(t, n, r) {
  function ie (line 3998) | function ie(t, n) {
  function ue (line 4007) | function ue(t, n, r, e, o) {
  function se (line 4022) | function se(t, n) {
  function fe (line 4025) | function fe(t, n) {
  function le (line 4028) | function le(t, n) {
  function pe (line 4033) | function pe(t, n) {
  function he (line 4038) | function he(t, n, r) {
  function ve (line 4042) | function ve(t) {
  function de (line 4063) | function de(t, n) {
  function ye (line 4066) | function ye(t, n) {
  function _e (line 4069) | function _e(t, n) {
  function ge (line 4072) | function ge(t, n, r) {
  function be (line 4108) | function be(t, n, r) {
  function me (line 4112) | function me(t) {
  function we (line 4115) | function we(t, n, r, e, o) {
  function xe (line 4243) | function xe(t, n, r, e) {
  function je (line 4266) | function je(t) {
  function Ee (line 4273) | function Ee(t) {
  function Se (line 4284) | function Se(t) {
  function Oe (line 4291) | function Oe(t) {
  function Ae (line 4304) | function Ae(t, n) {
  function Te (line 4307) | function Te(t, n) {
  function Ce (line 4317) | function Ce(t) {
  function Ie (line 4325) | function Ie(t, n) {
  function Pe (line 4333) | function Pe(t, n, r, e, o) {
  function ke (line 4379) | function ke(t, n) {
  function Re (line 4383) | function Re(t, n, r) {
  function De (line 4422) | function De(t, n, r) {
  function Ue (line 4430) | function Ue(t, n, r, e) {
  function Ne (line 4444) | function Ne(t, n) {
  function ze (line 4454) | function ze(t, n) {
  function Fe (line 4457) | function Fe(t, n) {
  function Le (line 4465) | function Le(t, n) {
  function qe (line 4468) | function qe(t) {
  function Me (line 4471) | function Me(t, n) {
  function $e (line 4475) | function $e(t, n, r, e) {
  function We (line 4508) | function We(t) {
  function Ve (line 4511) | function Ve(t, n, r) {
  function Ge (line 4521) | function Ge(t, n) {
  function Ze (line 4530) | function Ze(t, n, r) {
  function Je (line 4545) | function Je(t, n, r, e) {
  function Xe (line 4576) | function Xe(t, n) {
  function Ke (line 4587) | function Ke(t) {
  function Ye (line 4590) | function Ye(t) {
  function Qe (line 4597) | function Qe(t, n, r) {
  function to (line 4620) | function to(t, n) {
  function no (line 4623) | function no(t, n, r, e) {
  function ro (line 4626) | function ro(t, n, r, e) {
  function eo (line 4636) | function eo(t, n) {
  function oo (line 4649) | function oo(t, n, r) {
  function io (line 4657) | function io(t, n, r) {
  function uo (line 4664) | function uo(t) {
  function co (line 4667) | function co(t) {
  function ao (line 4670) | function ao(t, n) {
  function fo (line 4674) | function fo(t, n, r) {
  function po (line 4683) | function po(t, n) {
  function ho (line 4689) | function ho(t) {
  function vo (line 4693) | function vo(t, n) {
  function yo (line 4697) | function yo(t, n) {
  function _o (line 4726) | function _o(t, n, r, o) {
  function go (line 4744) | function go(t, n, r, o) {
  function bo (line 4763) | function bo(t, n) {
  function mo (line 4769) | function mo(t, n, r, e) {
  function wo (line 4779) | function wo(t, n) {
  function xo (line 4786) | function xo(t) {
  function jo (line 4805) | function jo(t, n) {
  function Eo (line 4817) | function Eo(t) {
  function So (line 4826) | function So(t) {
  function Oo (line 4834) | function Oo(t) {
  function Ao (line 4839) | function Ao(t) {
  function To (line 4865) | function To(t) {
  function Co (line 4879) | function Co(t) {
  function Io (line 4909) | function Io(t, n, r, o, i, u, c, a, s, f) {
  function Po (line 4942) | function Po(t, n) {
  function ko (line 4954) | function ko(t, n) {
  function Ro (line 4968) | function Ro(t) {
  function Do (line 4981) | function Do(t, n) {
  function Uo (line 4987) | function Uo(t) {
  function No (line 5005) | function No(t) {
  function zo (line 5014) | function zo(t, n, r, e, o, i, u, c, a, s) {
  function Fo (line 5032) | function Fo(t) {
  function qo (line 5054) | function qo(t) {
  function Mo (line 5068) | function Mo(t, n, r, o, i, a, s, f) {
  function $o (line 5190) | function $o(t, n, r, e) {
  function Bo (line 5193) | function Bo(t, n, r, e, o, i) {
  function Ho (line 5201) | function Ho(t) {
  function Wo (line 5204) | function Wo(t, n, r, e, o, i) {
  function Vo (line 5240) | function Vo(t) {
  function Go (line 5243) | function Go(t) {
  function Zo (line 5246) | function Zo(t) {
  function Xo (line 5254) | function Xo(t) {
  function Ko (line 5266) | function Ko(t) {
  function Yo (line 5269) | function Yo() {
  function Qo (line 5276) | function Qo(t, n) {
  function ti (line 5291) | function ti(t) {
  function ni (line 5299) | function ni(t, n) {
  function ii (line 5322) | function ii(t, n, r) {
  function ui (line 5335) | function ui(t) {
  function ci (line 5338) | function ci(t) {
  function ai (line 5341) | function ai(t, n) {
  function si (line 5351) | function si(t, n, r) {
  function fi (line 5360) | function fi(t, n) {
  function li (line 5376) | function li(t) {
  function hi (line 5409) | function hi(t) {
  function vi (line 5413) | function vi(t) {
  function di (line 5416) | function di(t, n) {
  function yi (line 5421) | function yi(t, n, r) {
  function _i (line 5437) | function _i(t, n) {
  function gi (line 5440) | function gi(t, n) {
  function bi (line 5447) | function bi(t, n) {
  function ji (line 5461) | function ji(t, n, r) {
  function Ei (line 5494) | function Ei(t) {
  function Si (line 5506) | function Si(t, n) {
  function Ai (line 5533) | function Ai(t) {
  function Ti (line 5538) | function Ti(t) {
  function Ci (line 5549) | function Ci(t) {
  function Ri (line 5576) | function Ri(t, n, r) {
  function Di (line 5582) | function Di(t, n, r) {
  function Ui (line 5592) | function Ui(t) {
  function Ni (line 5595) | function Ni(t) {
  function qi (line 5618) | function qi(t) {
  function $i (line 5623) | function $i(t, n) {
  function Hi (line 5639) | function Hi(t) {
  function Zi (line 5656) | function Zi(t) {
  function Ji (line 5668) | function Ji(t, n) {
  function ru (line 5702) | function ru(t) {
  function eu (line 5706) | function eu(t, n) {
  function au (line 5734) | function au(t, n) {
  function su (line 5737) | function su(t, n) {
  function hu (line 5757) | function hu(t, n) {
  function _u (line 5783) | function _u(t, n, r) {
  function gu (line 5797) | function gu(t, n) {
  function wu (line 5827) | function wu(t, n, r) {
  function Eu (line 5895) | function Eu(t, n) {
  function Su (line 5908) | function Su(t) {
  function Iu (line 5946) | function Iu(t, n) {
  function Nu (line 5968) | function Nu(t) {
  function zu (line 5971) | function zu(t) {
  function qu (line 5980) | function qu(t) {
  function Mu (line 5991) | function Mu(t) {
  function $u (line 6001) | function $u(t) {
  function Bu (line 6004) | function Bu(t) {
  function Hu (line 6012) | function Hu(t) {
  function Wu (line 6016) | function Wu(t) {
  function Gu (line 6024) | function Gu(t) {
  function Zu (line 6027) | function Zu(t) {
  function Ku (line 6044) | function Ku(t) {
  function Yu (line 6047) | function Yu(t) {
  function rc (line 6059) | function rc(t) {
  function ec (line 6070) | function ec(t) {
  function oc (line 6081) | function oc(t) {
  function ic (line 6086) | function ic(t) {
  function uc (line 6089) | function uc(t) {
  function cc (line 6105) | function cc(t) {
  function ac (line 6108) | function ac(t) {
  function yc (line 6142) | function yc(t, n, r) {
  function _c (line 6146) | function _c(t, n) {
  function wc (line 6158) | function wc(t) {
  function xc (line 6161) | function xc(t) {
  function Ac (line 6191) | function Ac(t, n) {
  function Ic (line 6205) | function Ic(t) {
  function kc (line 6211) | function kc(t) {
  function Rc (line 6214) | function Rc(t) {
  function Mc (line 6234) | function Mc(t, n, r) {
  function Hc (line 6265) | function Hc(t) {
  function Gc (line 6272) | function Gc(t) {
  function Zc (line 6275) | function Zc(t) {
  function Kc (line 6288) | function Kc(t, n, r) {
  function Yc (line 6318) | function Yc() {}
  function ra (line 6322) | function ra(t) {
  function ia (line 6333) | function ia() {
  function ua (line 6336) | function ua() {
  function a (line 8004) | function a(t) {
  function r (line 8070) | function r(t) {
  function e (line 8206) | function e(t) {
  function o (line 8725) | function o(t, n) {
  function i (line 9210) | function i(t) {
  function u (line 9222) | function u(t) {
  function i (line 9282) | function i(o) {
  function s (line 9398) | function s(t) {
  function r (line 9425) | function r(t, n) {
  function e (line 9437) | function e(t, n) {
  function e (line 9490) | function e(t) {
  function e (line 9602) | function e(t) {
  function e (line 9695) | function e(t) {
  function f (line 9724) | function f() {}
  function h (line 9765) | function h(t) {
  function v (line 9771) | function v(t, n, r) {
  function d (line 9781) | function d(t) {
  function y (line 9790) | function y(t) {
  function _ (line 9793) | function _(t) {
  function g (line 9830) | function g(t, n) {
  function b (line 9874) | function b(t, n, r) {
  function e (line 10146) | function e(t) {
  function r (line 10163) | function r() {
  function e (line 10210) | function e(t) {
  function i (line 10226) | function i(t) {
  function o (line 10504) | function o(t) {
  function r (line 10578) | function r() {
  function i (line 10620) | function i(t, n) {
  function h (line 10749) | function h(t) {
  function v (line 10752) | function v(t) {
  function e (line 10794) | function e(t, n) {
  function e (line 10831) | function e(t) {
  function u (line 10882) | function u(t, n) {
  function c (line 10906) | function c(t) {
  function n (line 10924) | function n() {
  function r (line 10935) | function r(t) {
  function e (line 10938) | function e(t) {
  function o (line 10951) | function o(t) {
  function i (line 10954) | function i(t) {
  function y (line 11193) | function y(t) {
  function _ (line 11196) | function _(t, n) {
  function g (line 11206) | function g(t) {
  function O (line 11365) | function O(t) {
  function A (line 11383) | function A(t) {
  function T (line 11406) | function T(t, n) {
  function C (line 11446) | function C(t) {
  function I (line 11449) | function I(t, n) {
  function P (line 11454) | function P(t, n) {
  function D (line 11503) | function D(t) {
  function U (line 11570) | function U(t, n) {
  function q (line 11593) | function q(t) {
  function M (line 11607) | function M(t) {
  function $ (line 11653) | function $(t, n) {
  function B (line 11704) | function B(t, n) {
  function rt (line 11768) | function rt(t) {

FILE: frontend/src/utils/simplifySchema.ts
  function simplifySchema (line 6) | function simplifySchema(schema: any) {

FILE: frontend/src/utils/str.ts
  function str (line 1) | function str(o: unknown): React.ReactNode {

FILE: tools/redis_to_postgres/migrate_data.py
  function keys (line 42) | def keys(match: str) -> Iterator[str]:
  function load (line 52) | def load(keys: list[str], values: list[bytes]) -> dict:
  class RedisCheckpoint (line 56) | class RedisCheckpoint(BaseCheckpointSaver):
    method config_specs (line 60) | def config_specs(self) -> list[ConfigurableFieldSpec]:
    method _dump (line 80) | def _dump(self, mapping: dict[str, Any]) -> dict:
    method _load (line 85) | def _load(self, mapping: dict[bytes, bytes]) -> dict:
    method _hash_key (line 91) | def _hash_key(self, config: RunnableConfig) -> str:
    method get (line 96) | def get(self, config: RunnableConfig) -> Checkpoint | None:
    method put (line 117) | def put(self, config: RunnableConfig, checkpoint: Checkpoint) -> None:
  function migrate_assistants (line 121) | async def migrate_assistants(conn: asyncpg.Connection) -> None:
  function migrate_threads (line 150) | async def migrate_threads(conn: asyncpg.Connection) -> None:
  function migrate_checkpoints (line 179) | async def migrate_checkpoints() -> None:
  function migrate_embeddings (line 202) | async def migrate_embeddings(conn: asyncpg.Connection) -> None:
  function migrate_data (line 267) | async def migrate_data():
  function main (line 277) | async def main():


Condensed preview: 132 files, each showing path, character count, and a content snippet (full structured content: 2,041K chars).

[
  {
    "path": ".github/actions/poetry_setup/action.yml",
    "chars": 3292,
    "preview": "# An action for setting up poetry install with caching.\n# Using a custom action since the default action does not\n# take"
  },
  {
    "path": ".github/workflows/_lint.yml",
    "chars": 2614,
    "preview": "name: lint\n\non:\n  workflow_call:\n    inputs:\n      working-directory:\n        required: true\n        type: string\n      "
  },
  {
    "path": ".github/workflows/build_deploy_image.yml",
    "chars": 942,
    "preview": "name: Build, Push, and Deploy Open GPTS\n\non:\n  push:\n    branches: [main]\n  workflow_dispatch:\n\njobs:\n  build-and-push:\n"
  },
  {
    "path": ".github/workflows/ci.yml",
    "chars": 3231,
    "preview": "---\nname: CI\n\non:\n  push:\n    branches: [main]\n  pull_request: # Trigger on all PRs, ensuring required actions to be run"
  },
  {
    "path": ".gitignore",
    "chars": 725,
    "preview": "*.env\n.env.gcp.yaml\npostgres-volume/\nredis-volume/\nbackend/ui\n\n# Operating System generated files\n.DS_Store\nThumbs.db\neh"
  },
  {
    "path": "API.md",
    "chars": 9253,
    "preview": "# API Getting Started\n\nThis documentation covers how to get started with the API that backs OpenGPTs.\nThis allows you to"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 1022,
    "preview": "# Contributing\n\n## Contributor License Agreement\n\nWe are grateful to the contributors who help evolve OpenGPTs and dedic"
  },
  {
    "path": "Dockerfile",
    "chars": 1194,
    "preview": "FROM node:20 AS builder\n\nWORKDIR /frontend\n\nCOPY ./frontend/package.json ./frontend/yarn.lock ./\n\nRUN yarn --network-tim"
  },
  {
    "path": "LICENSE",
    "chars": 1066,
    "preview": "MIT License\n\nCopyright (c) LangChain, Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy"
  },
  {
    "path": "README.md",
    "chars": 18028,
    "preview": "# OpenGPTs\n\nThis is an open source effort to create a similar experience to OpenAI's GPTs and Assistants API.\nIt is powe"
  },
  {
    "path": "auth.md",
    "chars": 1574,
    "preview": "# Auth\n\nBy default, we're using cookies as a mock auth method. It's for trying out OpenGPTs.\nFor production, we recommen"
  },
  {
    "path": "backend/.gitignore",
    "chars": 10,
    "preview": ".envrc\nui\n"
  },
  {
    "path": "backend/Dockerfile",
    "chars": 948,
    "preview": "# Backend Dockerfile\nFROM python:3.11\n\nARG TARGETOS\nARG TARGETARCH\nARG TARGETVARIANT\n\n# Install system dependencies\nRUN "
  },
  {
    "path": "backend/Makefile",
    "chars": 2166,
    "preview": ".PHONY: all lint format test help\n\n# Default target executed when no arguments are given to make.\nall: help\n\nbuild_ui:\n\t"
  },
  {
    "path": "backend/README.md",
    "chars": 1318,
    "preview": "# backend\n\n## Database Migrations\n\n### Migration 5 - Checkpoint Management Update\nThis migration introduces a significan"
  },
  {
    "path": "backend/app/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/app/agent.py",
    "chars": 12146,
    "preview": "from enum import Enum\nfrom typing import Any, Dict, Mapping, Optional, Sequence, Union\n\nfrom langchain_core.messages imp"
  },
  {
    "path": "backend/app/agent_types/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/app/agent_types/prompts.py",
    "chars": 772,
    "preview": "xml_template = \"\"\"{system_message}\n\nYou have access to the following tools:\n\n{tools}\n\nIn order to use a tool, you can us"
  },
  {
    "path": "backend/app/agent_types/tools_agent.py",
    "chars": 4543,
    "preview": "from typing import cast\n\nfrom langchain.tools import BaseTool\nfrom langchain_core.language_models.base import LanguageMo"
  },
  {
    "path": "backend/app/agent_types/xml_agent.py",
    "chars": 5716,
    "preview": "from langchain.tools import BaseTool\nfrom langchain.tools.render import render_text_description\nfrom langchain_core.lang"
  },
  {
    "path": "backend/app/api/__init__.py",
    "chars": 545,
    "preview": "from fastapi import APIRouter\n\nfrom app.api.assistants import router as assistants_router\nfrom app.api.runs import route"
  },
  {
    "path": "backend/app/api/assistants.py",
    "chars": 2277,
    "preview": "from typing import Annotated, List\nfrom uuid import uuid4\n\nfrom fastapi import APIRouter, HTTPException, Path\nfrom pydan"
  },
  {
    "path": "backend/app/api/runs.py",
    "chars": 5119,
    "preview": "from typing import Any, Dict, Optional, Sequence, Union\nfrom uuid import UUID\n\nimport langsmith.client\nfrom fastapi impo"
  },
  {
    "path": "backend/app/api/threads.py",
    "chars": 4106,
    "preview": "from typing import Annotated, Any, Dict, List, Optional, Sequence, Union\nfrom uuid import uuid4\n\nfrom fastapi import API"
  },
  {
    "path": "backend/app/auth/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/app/auth/handlers.py",
    "chars": 3841,
    "preview": "from abc import ABC, abstractmethod\nfrom functools import lru_cache\nfrom typing import Annotated\n\nimport jwt\nimport requ"
  },
  {
    "path": "backend/app/auth/settings.py",
    "chars": 2304,
    "preview": "import os\nfrom base64 import b64decode\nfrom enum import Enum\nfrom typing import List, Optional, Union\n\nfrom pydantic imp"
  },
  {
    "path": "backend/app/chatbot.py",
    "chars": 844,
    "preview": "from typing import Annotated, List\n\nfrom langchain_core.language_models.base import LanguageModelLike\nfrom langchain_cor"
  },
  {
    "path": "backend/app/checkpoint.py",
    "chars": 4000,
    "preview": "import os\nfrom typing import Any, AsyncIterator, Optional, Sequence\n\nimport structlog\nfrom langgraph.checkpoint.base imp"
  },
  {
    "path": "backend/app/ingest.py",
    "chars": 1829,
    "preview": "\"\"\"Code to ingest blob into a vectorstore.\n\nCode is responsible for taking binary data, parsing it and then indexing it\n"
  },
  {
    "path": "backend/app/lifespan.py",
    "chars": 1713,
    "preview": "import os\nfrom contextlib import asynccontextmanager\n\nimport asyncpg\nimport orjson\nimport structlog\nfrom fastapi import "
  },
  {
    "path": "backend/app/llms.py",
    "chars": 3315,
    "preview": "import os\nfrom functools import lru_cache\nfrom urllib.parse import urlparse\n\nimport boto3\nimport httpx\nimport structlog\n"
  },
  {
    "path": "backend/app/message_types.py",
    "chars": 1190,
    "preview": "from typing import Any\n\nfrom langchain_core.messages import (\n    FunctionMessage,\n    MessageLikeRepresentation,\n    To"
  },
  {
    "path": "backend/app/parsing.py",
    "chars": 819,
    "preview": "\"\"\"Module contains logic for parsing binary blobs into text.\"\"\"\nfrom langchain_community.document_loaders.parsers import"
  },
  {
    "path": "backend/app/retrieval.py",
    "chars": 4966,
    "preview": "import operator\nfrom typing import Annotated, List, Sequence, TypedDict\nfrom uuid import uuid4\n\nfrom langchain_core.lang"
  },
  {
    "path": "backend/app/schema.py",
    "chars": 1106,
    "preview": "from datetime import datetime\nfrom typing import Optional\n\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    u"
  },
  {
    "path": "backend/app/server.py",
    "chars": 1944,
    "preview": "import os\nfrom pathlib import Path\n\nimport orjson\nimport structlog\nfrom fastapi import FastAPI, Form, UploadFile\nfrom fa"
  },
  {
    "path": "backend/app/storage.py",
    "chars": 8713,
    "preview": "from datetime import datetime, timezone\nfrom typing import Any, List, Optional, Sequence, Union\n\nfrom langchain_core.mes"
  },
  {
    "path": "backend/app/stream.py",
    "chars": 3654,
    "preview": "import functools\nfrom typing import Any, AsyncIterator, Dict, Optional, Sequence, Union\n\nimport orjson\nimport structlog\n"
  },
  {
    "path": "backend/app/tools.py",
    "chars": 10759,
    "preview": "from enum import Enum\nfrom functools import lru_cache\nfrom typing import Annotated, Literal\n\nfrom langchain.tools.retrie"
  },
  {
    "path": "backend/app/upload.py",
    "chars": 5606,
    "preview": "\"\"\"API to deal with file uploads via a runnable.\n\nFor now this code assumes that the content is a base64 encoded string."
  },
  {
    "path": "backend/log_config.json",
    "chars": 1886,
    "preview": "{\n    \"version\": 1,\n    \"disable_existing_loggers\": false,\n    \"formatters\": {\n        \"default\": {\n            \"()\": \"u"
  },
  {
    "path": "backend/migrations/000001_create_extensions_and_first_tables.down.sql",
    "chars": 95,
    "preview": "DROP TABLE IF EXISTS thread;\nDROP TABLE IF EXISTS assistant;\nDROP TABLE IF EXISTS checkpoints;\n"
  },
  {
    "path": "backend/migrations/000001_create_extensions_and_first_tables.up.sql",
    "chars": 824,
    "preview": "CREATE EXTENSION IF NOT EXISTS vector;\nCREATE EXTENSION IF NOT EXISTS \"uuid-ossp\";\n\nCREATE TABLE IF NOT EXISTS assistant"
  },
  {
    "path": "backend/migrations/000002_checkpoints_update_schema.down.sql",
    "chars": 179,
    "preview": "ALTER TABLE checkpoints\n    DROP CONSTRAINT IF EXISTS checkpoints_pkey,\n    ADD PRIMARY KEY (thread_id),\n    DROP COLUMN"
  },
  {
    "path": "backend/migrations/000002_checkpoints_update_schema.up.sql",
    "chars": 346,
    "preview": "ALTER TABLE checkpoints\n    ADD COLUMN IF NOT EXISTS thread_ts TIMESTAMPTZ,\n    ADD COLUMN IF NOT EXISTS parent_ts TIMES"
  },
  {
    "path": "backend/migrations/000003_create_user.down.sql",
    "chars": 274,
    "preview": "ALTER TABLE assistant\n    DROP CONSTRAINT fk_assistant_user_id,\n    ALTER COLUMN user_id TYPE VARCHAR USING (user_id::te"
  },
  {
    "path": "backend/migrations/000003_create_user.up.sql",
    "chars": 851,
    "preview": "CREATE TABLE IF NOT EXISTS \"user\" (\n    user_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),\n    sub VARCHAR(255) UNIQUE"
  },
  {
    "path": "backend/migrations/000004_add_metadata_to_thread.down.sql",
    "chars": 40,
    "preview": "ALTER TABLE thread\nDROP COLUMN metadata;"
  },
  {
    "path": "backend/migrations/000004_add_metadata_to_thread.up.sql",
    "chars": 260,
    "preview": "ALTER TABLE thread\nADD COLUMN metadata JSONB;\n\nUPDATE thread\nSET metadata = json_build_object(\n    'assistant_type', (SE"
  },
  {
    "path": "backend/migrations/000005_advanced_checkpoints_schema.down.sql",
    "chars": 448,
    "preview": "-- Drop the blob storage table\nDROP TABLE IF EXISTS checkpoint_blobs;\n\n-- Drop the writes tracking table\nDROP TABLE IF E"
  },
  {
    "path": "backend/migrations/000005_advanced_checkpoints_schema.up.sql",
    "chars": 574,
    "preview": "-- BREAKING CHANGE WARNING:\n-- This migration represents a transition from pickle-based checkpointing to a new checkpoin"
  },
  {
    "path": "backend/pyproject.toml",
    "chars": 2505,
    "preview": "[tool.poetry]\nname = \"opengpts\"\nversion = \"0.1.0\"\ndescription = \"\"\nauthors = [\"Your Name <you@example.com>\"]\nreadme = \"R"
  },
  {
    "path": "backend/tests/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/tests/unit_tests/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/tests/unit_tests/agent_executor/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/tests/unit_tests/agent_executor/test_parsing.py",
    "chars": 1413,
    "preview": "\"\"\"Test parsing logic.\"\"\"\nimport mimetypes\n\nfrom langchain_community.document_loaders import Blob\n\nfrom app.parsing impo"
  },
  {
    "path": "backend/tests/unit_tests/agent_executor/test_upload.py",
    "chars": 1655,
    "preview": "from io import BytesIO\n\nfrom fastapi import UploadFile\nfrom langchain.text_splitter import RecursiveCharacterTextSplitte"
  },
  {
    "path": "backend/tests/unit_tests/app/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/tests/unit_tests/app/helpers.py",
    "chars": 342,
    "preview": "from contextlib import asynccontextmanager\n\nfrom httpx import AsyncClient\nfrom typing_extensions import AsyncGenerator\n\n"
  },
  {
    "path": "backend/tests/unit_tests/app/test_app.py",
    "chars": 4470,
    "preview": "\"\"\"Test the server and client together.\"\"\"\n\nfrom typing import Optional, Sequence\nfrom uuid import uuid4\n\nimport asyncpg"
  },
  {
    "path": "backend/tests/unit_tests/app/test_auth.py",
    "chars": 3140,
    "preview": "from base64 import b64encode\nfrom datetime import datetime, timedelta, timezone\nfrom typing import Optional\nfrom unittes"
  },
  {
    "path": "backend/tests/unit_tests/conftest.py",
    "chars": 2643,
    "preview": "import asyncio\nimport os\nimport subprocess\n\nimport asyncpg\nimport pytest\n\nfrom app.auth.settings import AuthType\nfrom ap"
  },
  {
    "path": "backend/tests/unit_tests/fixtures/__init__.py",
    "chars": 201,
    "preview": "from pathlib import Path\nfrom typing import List\n\nHERE = Path(__file__).parent\n\n# PUBLIC API\n\n\ndef get_sample_paths() ->"
  },
  {
    "path": "backend/tests/unit_tests/fixtures/sample.html",
    "chars": 7314,
    "preview": "<html><head><meta content=\"text/html; charset=UTF-8\" http-equiv=\"content-type\"><style type=\"text/css\">.lst-kix_n6n0tzfwn"
  },
  {
    "path": "backend/tests/unit_tests/fixtures/sample.rtf",
    "chars": 1310503,
    "preview": "{\\rtf1\\ansi\\ansicpg1252\\uc0\\stshfdbch0\\stshfloch0\\stshfhich0\\stshfbi0\\deff0\\adeff0{\\fonttbl{\\f0\\froman\\fcharset0\\fprq2{\\"
  },
  {
    "path": "backend/tests/unit_tests/fixtures/sample.txt",
    "chars": 230,
    "preview": "🦜️ LangChain\r\n\r\n\r\n\r\n\r\nUnderline\r\n\r\n\r\nBold\r\n\r\n\r\nItalics\r\n\r\n\r\n\r\n\r\n\r\n\r\n\tCol 1\r\n\tCol 2\r\n\tRow 1\r\n\t1\r\n\t2\r\n\tRow 2\r\n\t3\r\n\t4\r\n\t\r\n"
  },
  {
    "path": "backend/tests/unit_tests/test_imports.py",
    "chars": 156,
    "preview": "\"\"\"Shallow tests that make sure we can at least import the code.\"\"\"\n\n\ndef test_import_app() -> None:\n    \"\"\"Test import "
  },
  {
    "path": "backend/tests/unit_tests/utils.py",
    "chars": 3417,
    "preview": "\"\"\"Test ingestion utilities.\"\"\"\nfrom typing import Any, Dict, Iterable, List, Optional, Sequence, Type\n\nfrom langchain.s"
  },
  {
    "path": "docker-compose-prod.yml",
    "chars": 1042,
    "preview": "version: \"3\"\n\nservices:\n  postgres:\n    image: pgvector/pgvector:pg16\n    healthcheck:\n      test: pg_isready -U $POSTGR"
  },
  {
    "path": "docker-compose.yml",
    "chars": 1412,
    "preview": "version: \"3\"\n\nservices:\n  postgres:\n    image: pgvector/pgvector:pg16\n    healthcheck:\n      test: pg_isready -U $POSTGR"
  },
  {
    "path": "frontend/.eslintrc.cjs",
    "chars": 481,
    "preview": "module.exports = {\n  root: true,\n  env: { browser: true, es2020: true },\n  extends: [\n    \"eslint:recommended\",\n    \"plu"
  },
  {
    "path": "frontend/.gitignore",
    "chars": 253,
    "preview": "# Logs\nlogs\n*.log\nnpm-debug.log*\nyarn-debug.log*\nyarn-error.log*\npnpm-debug.log*\nlerna-debug.log*\n\nnode_modules\ndist\ndis"
  },
  {
    "path": "frontend/Dockerfile",
    "chars": 366,
    "preview": "# Frontend Dockerfile\nFROM node:20\n\n# Set the working directory\nWORKDIR /frontend\n\n# Copy the package.json and yarn.lock"
  },
  {
    "path": "frontend/README.md",
    "chars": 1263,
    "preview": "# React + TypeScript + Vite\n\nThis template provides a minimal setup to get React working in Vite with HMR and some ESLin"
  },
  {
    "path": "frontend/index.html",
    "chars": 362,
    "preview": "<!doctype html>\n<html lang=\"en\">\n  <head>\n    <meta charset=\"UTF-8\" />\n    <meta name=\"viewport\" content=\"width=device-w"
  },
  {
    "path": "frontend/package.json",
    "chars": 1575,
    "preview": "{\n  \"name\": \"frontend\",\n  \"private\": true,\n  \"version\": \"0.0.0\",\n  \"packageManager\": \"yarn@1.22.19\",\n  \"type\": \"module\","
  },
  {
    "path": "frontend/postcss.config.js",
    "chars": 80,
    "preview": "export default {\n  plugins: {\n    tailwindcss: {},\n    autoprefixer: {},\n  },\n}\n"
  },
  {
    "path": "frontend/src/App.tsx",
    "chars": 5872,
    "preview": "import { useCallback, useState } from \"react\";\nimport { InformationCircleIcon } from \"@heroicons/react/24/outline\";\nimpo"
  },
  {
    "path": "frontend/src/api/assistants.ts",
    "chars": 725,
    "preview": "import { Config } from \"../hooks/useConfigList\";\n\nexport async function getAssistant(\n  assistantId: string,\n): Promise<"
  },
  {
    "path": "frontend/src/api/threads.ts",
    "chars": 366,
    "preview": "import { Chat } from \"../types\";\n\nexport async function getThread(threadId: string): Promise<Chat | null> {\n  try {\n    "
  },
  {
    "path": "frontend/src/components/Chat.tsx",
    "chars": 5570,
    "preview": "import { useEffect, useRef, useState } from \"react\";\nimport { StreamStateProps } from \"../hooks/useStreamState\";\nimport "
  },
  {
    "path": "frontend/src/components/ChatList.tsx",
    "chars": 6350,
    "preview": "import { PlusIcon, EllipsisVerticalIcon } from \"@heroicons/react/24/outline\";\nimport { useState, useEffect } from \"react"
  },
  {
    "path": "frontend/src/components/Config.tsx",
    "chars": 24730,
    "preview": "import { Fragment, useCallback, useEffect, useState } from \"react\";\nimport { ShareIcon } from \"@heroicons/react/24/outli"
  },
  {
    "path": "frontend/src/components/ConfigList.tsx",
    "chars": 3984,
    "preview": "import { TYPES } from \"../constants\";\nimport { Config, ConfigListProps } from \"../hooks/useConfigList\";\nimport { cn } fr"
  },
  {
    "path": "frontend/src/components/Document.tsx",
    "chars": 3737,
    "preview": "import { useMemo, useState } from \"react\";\nimport { ChevronDownIcon, ChevronRightIcon } from \"@heroicons/react/24/outlin"
  },
  {
    "path": "frontend/src/components/FileUpload.tsx",
    "chars": 2907,
    "preview": "import { useMemo } from \"react\";\nimport { DropzoneState } from \"react-dropzone\";\nimport { XCircleIcon } from \"@heroicons"
  },
  {
    "path": "frontend/src/components/JsonEditor.tsx",
    "chars": 2122,
    "preview": "import CodeMirror from \"@uiw/react-codemirror\";\nimport { json } from \"@codemirror/lang-json\";\nimport { EditorView, keyma"
  },
  {
    "path": "frontend/src/components/LangSmithActions.tsx",
    "chars": 1865,
    "preview": "import {\n  HandThumbDownIcon,\n  HandThumbUpIcon,\n  EllipsisHorizontalIcon,\n  CheckIcon,\n} from \"@heroicons/react/24/outl"
  },
  {
    "path": "frontend/src/components/Layout.tsx",
    "chars": 4722,
    "preview": "import { Fragment } from \"react\";\nimport { Dialog, Transition } from \"@headlessui/react\";\nimport { Bars3Icon, XMarkIcon "
  },
  {
    "path": "frontend/src/components/Message.tsx",
    "chars": 3817,
    "preview": "import { memo, useState } from \"react\";\nimport { MessageDocument, Message as MessageType } from \"../types\";\nimport { str"
  },
  {
    "path": "frontend/src/components/MessageEditor.tsx",
    "chars": 9052,
    "preview": "import { memo } from \"react\";\nimport type { Message, ToolCall } from \"../types\";\nimport { str } from \"../utils/str\";\nimp"
  },
  {
    "path": "frontend/src/components/NewChat.tsx",
    "chars": 2450,
    "preview": "import { ConfigList } from \"./ConfigList\";\nimport { Schemas } from \"../hooks/useSchemas\";\nimport TypingBox from \"./Typin"
  },
  {
    "path": "frontend/src/components/NotFound.tsx",
    "chars": 68,
    "preview": "export function NotFound() {\n  return <div>Page not found.</div>;\n}\n"
  },
  {
    "path": "frontend/src/components/OrphanChat.tsx",
    "chars": 3836,
    "preview": "import { useEffect, useState } from \"react\";\nimport { Config } from \"../hooks/useConfigList\";\nimport { Chat } from \"../t"
  },
  {
    "path": "frontend/src/components/String.tsx",
    "chars": 619,
    "preview": "import { MarkedOptions, marked } from \"marked\";\nimport DOMPurify from \"dompurify\";\nimport { cn } from \"../utils/cn\";\n\nco"
  },
  {
    "path": "frontend/src/components/StringEditor.tsx",
    "chars": 1315,
    "preview": "import { cn } from \"../utils/cn\";\n\nconst COMMON_CLS = cn(\n  \"text-sm col-[1] row-[1] m-0 resize-none overflow-hidden whi"
  },
  {
    "path": "frontend/src/components/Tool.tsx",
    "chars": 2962,
    "preview": "import { ChevronDownIcon } from \"@heroicons/react/24/outline\";\nimport { ToolCall } from \"../types\";\nimport { str } from "
  },
  {
    "path": "frontend/src/components/TypingBox.tsx",
    "chars": 7247,
    "preview": "import {\n  PaperAirplaneIcon,\n  ChatBubbleLeftIcon,\n  XCircleIcon,\n  DocumentPlusIcon,\n  DocumentTextIcon,\n  DocumentIco"
  },
  {
    "path": "frontend/src/constants.ts",
    "chars": 1750,
    "preview": "export const TYPES = {\n  agent: {\n    id: \"agent\",\n    title: \"Assistant\",\n    description:\n      \"These GPTs can use an"
  },
  {
    "path": "frontend/src/hooks/useChatList.ts",
    "chars": 2412,
    "preview": "import { useCallback, useEffect, useReducer } from \"react\";\nimport orderBy from \"lodash/orderBy\";\nimport { Chat } from \""
  },
  {
    "path": "frontend/src/hooks/useChatMessages.ts",
    "chars": 2091,
    "preview": "import { useCallback, useEffect, useMemo, useRef, useState } from \"react\";\nimport { Message } from \"../types\";\nimport { "
  },
  {
    "path": "frontend/src/hooks/useConfigList.ts",
    "chars": 3106,
    "preview": "import { useCallback, useEffect, useReducer } from \"react\";\nimport orderBy from \"lodash/orderBy\";\nimport { getAssistants"
  },
  {
    "path": "frontend/src/hooks/useMessageEditing.ts",
    "chars": 1271,
    "preview": "import { useCallback, useState } from \"react\";\nimport { Message } from \"../types\";\nimport { omit } from \"lodash\";\n\nexpor"
  },
  {
    "path": "frontend/src/hooks/useSchemas.ts",
    "chars": 1078,
    "preview": "import { useEffect, useState } from \"react\";\nimport { simplifySchema } from \"../utils/simplifySchema\";\nimport { getDefau"
  },
  {
    "path": "frontend/src/hooks/useStatePersist.tsx",
    "chars": 563,
    "preview": "import React, { useEffect, useState } from \"react\";\n\nconst PREFIX = \"langgizmo-\";\n\nexport function useStatePersist<T>(\n "
  },
  {
    "path": "frontend/src/hooks/useStreamState.tsx",
    "chars": 3877,
    "preview": "/* eslint-disable @typescript-eslint/no-explicit-any */\nimport { useCallback, useState } from \"react\";\nimport { fetchEve"
  },
  {
    "path": "frontend/src/hooks/useThreadAndAssistant.ts",
    "chars": 1380,
    "preview": "import { useQuery, useQueryClient } from \"react-query\";\nimport { useParams } from \"react-router-dom\";\nimport { getAssist"
  },
  {
    "path": "frontend/src/hooks/useToolsSchemas.ts",
    "chars": 1790,
    "preview": "import { useEffect, useState } from \"react\";\nimport { ToolConfigSchema, ToolSchema } from \"../utils/formTypes.ts\";\nimpor"
  },
  {
    "path": "frontend/src/index.css",
    "chars": 131,
    "preview": "@tailwind base;\n@tailwind components;\n@tailwind utilities;\n\nhtml,\nbody,\n#root {\n  height: 100%;\n}\n\nbody {\n  background: "
  },
  {
    "path": "frontend/src/main.tsx",
    "chars": 1762,
    "preview": "import ReactDOM from \"react-dom/client\";\nimport { v4 as uuidv4 } from \"uuid\";\nimport App from \"./App.tsx\";\nimport \"./ind"
  },
  {
    "path": "frontend/src/types.ts",
    "chars": 561,
    "preview": "export interface ToolCall {\n  id: string;\n  name: string;\n  args: Record<string, unknown>;\n}\n\nexport interface MessageDo"
  },
  {
    "path": "frontend/src/utils/cn.ts",
    "chars": 183,
    "preview": "import clsx from \"clsx\";\nimport { ClassValue } from \"clsx\";\nimport { twMerge } from \"tailwind-merge\";\n\nexport function c"
  },
  {
    "path": "frontend/src/utils/defaults.ts",
    "chars": 5351,
    "preview": "// (c) 2015 Chute Corporation. Released under the terms of the MIT License.\n// Modified to use TypeScript and handle edg"
  },
  {
    "path": "frontend/src/utils/formTypes.ts",
    "chars": 658,
    "preview": "export type MessageWithFiles = {\n  message: string;\n  files: File[];\n};\n\nexport interface Tool {\n  id: string;\n  type: s"
  },
  {
    "path": "frontend/src/utils/json-refs.d.ts",
    "chars": 140,
    "preview": "/* eslint-disable @typescript-eslint/no-explicit-any */\nexport const JsonRefs: {\n  resolveRefs(schema: any): Promise<{ r"
  },
  {
    "path": "frontend/src/utils/json-refs.js",
    "chars": 367917,
    "preview": "// Original source: https://github.com/whitlockjc/json-refs/blob/master/dist/json-refs-min.js\n\nexport const JsonRefs = ("
  },
  {
    "path": "frontend/src/utils/simplifySchema.ts",
    "chars": 869,
    "preview": "/* eslint-disable @typescript-eslint/no-explicit-any */\nimport { JsonRefs } from \"./json-refs\";\n\n// jsonforms doesn't su"
  },
  {
    "path": "frontend/src/utils/str.ts",
    "chars": 147,
    "preview": "export function str(o: unknown): React.ReactNode {\n  return typeof o === \"object\"\n    ? JSON.stringify(o, null, 2)\n    :"
  },
  {
    "path": "frontend/src/vite-env.d.ts",
    "chars": 38,
    "preview": "/// <reference types=\"vite/client\" />\n"
  },
  {
    "path": "frontend/tailwind.config.js",
    "chars": 415,
    "preview": "/* eslint-disable no-undef */\n/** @type {import('tailwindcss').Config} */\nimport defaultTheme from \"tailwindcss/defaultT"
  },
  {
    "path": "frontend/tsconfig.json",
    "chars": 605,
    "preview": "{\n  \"compilerOptions\": {\n    \"target\": \"ES2020\",\n    \"useDefineForClassFields\": true,\n    \"lib\": [\"ES2020\", \"DOM\", \"DOM."
  },
  {
    "path": "frontend/tsconfig.node.json",
    "chars": 213,
    "preview": "{\n  \"compilerOptions\": {\n    \"composite\": true,\n    \"skipLibCheck\": true,\n    \"module\": \"ESNext\",\n    \"moduleResolution\""
  },
  {
    "path": "frontend/vite.config.ts",
    "chars": 474,
    "preview": "import { defineConfig } from \"vite\";\nimport react from \"@vitejs/plugin-react\";\n\n// https://vitejs.dev/config/\nexport def"
  },
  {
    "path": "tools/redis_to_postgres/Dockerfile",
    "chars": 198,
    "preview": "FROM langchain/open-gpts:latest\n\nRUN poetry add redis==5.0.1\n\nCOPY migrate_data.py .\n\n# Run database schema migrations a"
  },
  {
    "path": "tools/redis_to_postgres/README.md",
    "chars": 927,
    "preview": "OpenGPTs previously used Redis for data persistence, but has since switched to Postgres. If you have data in Redis that "
  },
  {
    "path": "tools/redis_to_postgres/docker-compose.yml",
    "chars": 491,
    "preview": "version: \"3\"\n\nservices:\n  redis:\n    image: redis/redis-stack-server:latest\n    ports:\n      - \"6380:6379\"\n    volumes:\n"
  },
  {
    "path": "tools/redis_to_postgres/migrate_data.py",
    "chars": 9984,
    "preview": "import asyncio\nimport json\nimport logging\nimport os\nimport pickle\nimport struct\nimport uuid\nfrom collections import defa"
  }
]

// ... and 3 more files (download for full content)
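
Each index entry above carries a `path`, a `chars` count, and a truncated `preview`. As a minimal sketch, assuming the index has been saved as a valid JSON array, the sizes can be totalled per top-level directory as shown below. The `SAMPLE_INDEX` string and the `chars_by_top_level` helper are hypothetical names introduced for illustration. The four sample entries reuse `path` and `chars` values from the listing, with `preview` left empty.

```python
import json
from collections import defaultdict

# Hypothetical sample in the same shape as the index entries above.
# "path" and "chars" values are copied from the listing; "preview" is blanked.
SAMPLE_INDEX = """
[
  {"path": "backend/app/storage.py", "chars": 8713, "preview": ""},
  {"path": "backend/app/tools.py", "chars": 10759, "preview": ""},
  {"path": "frontend/src/App.tsx", "chars": 5872, "preview": ""},
  {"path": "docker-compose.yml", "chars": 1412, "preview": ""}
]
"""


def chars_by_top_level(entries):
    """Sum the "chars" field per top-level directory (or root-level file name)."""
    totals = defaultdict(int)
    for entry in entries:
        # Everything before the first "/" is the top-level component.
        top = entry["path"].split("/", 1)[0]
        totals[top] += entry["chars"]
    return dict(totals)


entries = json.loads(SAMPLE_INDEX)
totals = chars_by_top_level(entries)

# Print the largest top-level components first.
for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total}")
```

Grouping on the first path component keeps the summary coarse. Splitting on more components would give a per-package breakdown, at the cost of a longer report.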

About this extraction

This page contains the full source code of the langchain-ai/opengpts GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 132 files (1.9 MB), approximately 419.6k tokens, and a symbol index with 662 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
